ABSTRACT


This study continues a previous face recognition investigation using uncooled infrared
technology. The database developed in an earlier study is expanded to include 50
volunteers with 30 facial images from each subject. An automatic image reduction method
reduces the size of each image from 160x120 to 60x45 pixels. The study re-examines two
linear classification methods: Principal Component Analysis (PCA) and Fisher Linear
Discriminant Analysis (LDA). Both PCA and LDA rely on eigenvector and eigenvalue
concepts. In addition, the Singular Value Decomposition based Snapshot method is applied
to decrease the computational load. K-fold cross validation is applied to estimate
classification performance. Results indicate that the best PCA-based method (using all
eigenvectors) produces an average classification performance of 79.22%. Combined with
PCA for dimension reduction, the LDA-based method achieves an average classification
performance of 94.58%. Additional testing on unfocused images shows no significant
impact on the overall classification performance. Overall results again confirm that
uncooled IR imaging can be used to identify individual subjects in a constrained indoor
environment.
EXECUTIVE SUMMARY


This study investigated face recognition using an uncooled infrared camera with an
expanded database. A database consisting of 420 facial images obtained from 14
volunteers was available from a previous study. An additional 1080 images from 36
volunteers were included in the expanded database, resulting in a total of 1500 images.
Each subject was required to perform three different facial expressions with 10
different head orientations. The facial expressions considered were neutral sitting,
smiling, and pronouncing the vowel “u”. The distance between the subject and the camera
was kept constant while permitting a vertical and horizontal angle freedom of 10°. In
addition, 36 images were collected with an intentionally unfocused camera lens for
additional analysis.


An automatic image cropping technique was developed to accommodate the expanded
database. The camera generates 160x120 pixels for each image, and the image size was
then reduced to 60x45 pixels by the automatic cropping technique. This study used two
linear schemes to investigate infrared imaging for face recognition. The first linear
scheme considered was the Principal Component Analysis (PCA). The second linear approach
was the Fisher Linear Discriminant Analysis (LDA), combined with PCA for dimension
reduction and classification. A minimum distance classifier was chosen to determine
classification decisions. Different PCA-based and LDA schemes were compared by k-fold
cross validation, which used 60% of the images for training and the remaining 40% for
testing.


Results indicate that the LDA is far superior to the PCA-based classification algorithm,
by at least 13%, with the expanded database. Overall results show that uncooled infrared
imagery can be an efficient and reliable tool for face recognition and classification.
I. INTRODUCTION


The major advantage of infrared (IR) or thermal imaging is its robustness to
illumination changes, as it depends only on the radiation emitted by an object. Thermal
imaging can detect, identify, and evaluate thermal-related problems in a matter of
seconds and has been widely used in military applications, as described later. It is
also a non-invasive diagnostic technique, which requires no physical contact with the
investigated targets. Over the years, IR imaging has been applied to various face
recognition applications; however, it remains unpopular due to the high cost of
equipment and maintenance. Recent developments in uncooled IR technology, with enhanced
image resolution and reduced equipment and maintenance costs, have provided an
opportunity to apply this technology to face recognition. This study expands on an
earlier study that investigated uncooled infrared imaging in face recognition and
classification applications [1].


A. INFRARED TECHNOLOGY BACKGROUND 

Sir William Herschel, an astronomer, discovered infrared in 1800 [2]. Knowing that
sunlight was made up of all the colors of the spectrum, and that it was also a source of
heat, Herschel wanted to find out which color(s) were responsible for heating objects.
He devised an experiment using a prism, paperboard, and thermometers with blackened
bulbs, in which he measured the temperatures of the different colors. Herschel observed
an increase in temperature as he moved the thermometer from violet to red in the rainbow
created by sunlight passing through the prism, and found that the hottest temperature
was actually beyond red light. The radiation causing this heating was not visible;
Herschel termed this invisible radiation "calorific rays." Today, Herschel's discovery
is known as infrared radiation, and infrared technology plays a major role in science
and engineering.


B. THE INFRARED CAMERA 
Infrared radiation is electromagnetic radiation whose wavelengths are greater than those
of visible light but shorter than those of microwaves. It is radiated heat that is
invisible to the human eye, yet can be sensed by our skin. All objects at temperatures
above absolute zero emit infrared radiation. The intensity of the radiated heat is
proportional to the fourth power of the absolute temperature of the object. It also
depends upon emissivity, which is a material property of the object. An ideal infrared
emitter, said to be a “blackbody,” has an emissivity of unity. Most real objects have
emissivities less than unity, and therefore emit less intense infrared radiation than a
blackbody at the same temperature does. In summary, temperature and emissivity
characterize the infrared emitting properties of an object [3].
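As a rough numerical illustration of the fourth-power relation described above, the
following Python sketch evaluates the Stefan-Boltzmann law M = e * sigma * T^4. The
function name and the example emissivity of 0.98 are assumptions chosen for
illustration; they are not taken from this study.

```python
# Stefan-Boltzmann constant in W / (m^2 K^4).
SIGMA = 5.670374419e-8


def radiant_exitance(temp_kelvin, emissivity=1.0):
    """Power radiated per unit area (W/m^2): M = emissivity * sigma * T^4."""
    if not 0.0 <= emissivity <= 1.0:
        raise ValueError("emissivity must lie between 0 and 1")
    return emissivity * SIGMA * temp_kelvin ** 4


# A blackbody (emissivity 1) versus a real surface at the same temperature.
blackbody = radiant_exitance(305.0)        # ideal emitter at about 32 degrees C
skin_like = radiant_exitance(305.0, 0.98)  # assumed emissivity, illustration only

# Doubling the absolute temperature multiplies the radiated power by 2**4 = 16.
ratio = radiant_exitance(600.0) / radiant_exitance(300.0)
```

The real surface always radiates less than the blackbody at the same temperature, which
is why both temperature and emissivity are needed to interpret a thermal image.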


Applying the infrared radiation concept, an infrared camera simply detects heat and
converts it to an electrical signal. The resulting electrical signals are then processed
to produce a thermal image on a video monitor and to perform temperature calculations
[4]. The infrared camera can accurately deliver, identify, and evaluate thermal
information. For example, firefighters use thermal imaging to locate the origin of a
fire effectively and efficiently and to save many lives. Figure 1 demonstrates an
infrared camera's robustness to illumination changes.





Figure 1. Thermal Image Taken under Complete Darkness (Left) and under Room 
Light (Right) (From Ref. 9.). 


C. THE MEASURE OF TEMPERATURE

Thermal imaging can be used to measure the temperature of an object remotely. For
example, infrared cameras with built-in temperature measurement capabilities provide the
information needed to keep electrical and mechanical equipment operating in an optimal
condition. Infrared cameras are able to detect some abnormalities that are often
invisible to the naked eye.


D. INFRARED TECHNOLOGY APPLICATIONS 

Infrared technology has a wide variety of applications in both military and civilian
industries. Thermal imaging is a critical piece of equipment in today's so-called “high
tech” military, as it is totally independent of visible light and can be used in
daylight or under complete darkness [5]. Tanks such as the M1A1 Abrams main battle tank
have driving systems that rely on thermal infrared imaging to navigate any terrain
during day or night. Apache helicopters have forward-looking infrared (FLIR) units
mounted on them that can give the pilot a thermal picture a mile ahead of the aircraft.
Fighter planes (F-16, F-14, F-15, F-18, stealth fighter, Tomcat, etc.) use thermal
infrared imaging to locate specific targets and direct fire at them. Even some of the
missiles being fired at the Iraqis are guided with infrared technology (heat-seeking
missiles) [5].


As mentioned earlier, firefighters use thermal imaging effectively to fight fires. It
has been shown many times that firefighters equipped with thermal imagers help save more
lives and preserve property. Being able to find trapped victims through dense smoke or
darkness is the most obvious application for thermal imaging, but many other
applications are possible. A thermal imager can contribute in a variety of unique ways
and become an indispensable firefighting tool [6].


Infrared technology thus has a wide range of applications. Recent developments make
thermal imaging affordable for various applications, which has a tremendous impact on
our society.


E. THE UNCOOLED INFRARED CAMERA 

Historically, infrared cameras have used sensors made of materials that require cooling
to a temperature equivalent to that of liquid nitrogen (77 K) [7]. Making such “cooled”
infrared cameras is extremely expensive. Through research studies, uncooled
high-performance detectors capable of sensing and measuring infrared energy have finally
become available. The key technology used in such cameras is the microbolometer focal
plane array, originally developed and patented by Honeywell. The microbolometer arrays
are fabricated using standard silicon technology coupled with micromachining. Each
detector incorporates a monolithic vanadium oxide and silicon microstructure on top of a
simplified CMOS read-out integrated circuit. The array used in the present camera
(IR-160 manufactured by Infrared Solutions) has 19,200 (160x120) individual detectors,
centered on a 51 µm pitch grid. The nominal response time of the microbolometer is about
12 ms, enabling operation at 30 Hz. The measured nominal NETD (Noise Equivalent
Temperature Difference) is about 50 mK at 30 °C, with an f/0.8 lens. The array structure
is tuned for maximum performance in the 8x10^-6 m to 12x10^-6 m (8 to 12 µm) waveband
[7].


With multiple data output options, the IR-160 is extremely flexible and can be
integrated into a wide range of applications. The IR-160 is low cost and measures
4.3x3.9x4.2 inches (WxHxD) including a 20-mm lens. The IR-160 engine weighs less than
5.0 oz and measures just 3.0x3.0x1.5 inches (WxHxD). An 8-bit video image via the RS-232
connection allows for real-time data transmission [8]. In addition, the camera has a
built-in germanium lens, which allows for manual focusing. Further, the camera can be
connected to the HyperTerminal program, which allows users to operate the camera
remotely [1].


The purpose of this study is to expand on an earlier study that investigated uncooled
infrared imaging in face recognition and classification applications by using 1500
infrared images obtained from 50 volunteers.


This chapter introduced the background information, applications, and current
development of infrared imaging. In addition, this chapter explained the reasons for
selecting the specific uncooled infrared camera for our study. Chapter II describes the
overall system set-up from the camera to the desktop computer. Chapter III presents the
image acquisition process, the image file nomenclature, and the automatic image
processing. Chapter IV examines the two linear approaches considered for face
classification, namely the Principal Component Analysis (PCA) and Fisher Linear
Discriminant Analysis (LDA). Examples are given to demonstrate the use of both PCA and
LDA, as well as a comparison of the two approaches. In addition, Chapter IV introduces
the minimum distance classifier selected for the classification step. Chapter V
describes the k-fold cross validation implemented and the results obtained. Chapter VI
presents the conclusions and recommendations for future study in infrared face
recognition. Appendix A includes all MATLAB codes and algorithms implemented in the
study. Finally, Appendix B includes all simulation results in spreadsheet format.
II. THE SYSTEM SET-UP


This chapter describes the overall face recognition system set-up, including the
infrared camera and its interface with the desktop computer. First, we discuss the basic
scheme used in this study, followed by the software and hardware components and the
choice of equipment.


A. THE BASIC SCHEME 

Our overall goal was to collect sufficient data and to use the collected images for
training and testing purposes. First, volunteers were required in order to collect
thermal images. Next, stored images were cropped and processed for dimension reduction.
The processed images retained only the features used for classification. Finally, the
database was divided into testing and training sets. Figure 2 depicts the overall face
recognition procedure.

Figure 2. Overall Face Recognition Scheme (From Ref. 9.).


1. Training Stage 

The face recognition training stage is illustrated in Figure 3. Training images are used
as reference images to perform facial classification. First, all cropped training images
are loaded into one data matrix. Next, a projection matrix is generated from this data
matrix and used to project the data onto a lower-dimensional feature space. The
projected data matrix then becomes the reference database and is ready for
classification.

Figure 3. Training Stage of Face Recognition (From Ref. 9.).


2. Testing Stage 

Figure 4 illustrates the face recognition testing stage. Once the training data have
been obtained, testing images are transformed in the same fashion as the training data.
Testing images are loaded into a single matrix and projected into the same
lower-dimensional feature space. To classify the “testing” set from the “training”
information, we simply look for the smallest distance between the testing data and the
centroids of all the training data in the projected feature space.
Figure 4. Testing Stage of Face Recognition (From Ref. 9.).
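
The minimum distance decision used in the testing stage can be sketched in Python as
follows. This is only an illustration under stated assumptions: the function names, the
toy two-dimensional features, and the choice of Euclidean distance are hypothetical and
do not reproduce the MATLAB code used in the study.

```python
import math


def class_centroids(train_vectors, train_labels):
    """Average the projected training vectors of each class."""
    sums, counts = {}, {}
    for vec, label in zip(train_vectors, train_labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def classify(test_vector, centroids):
    """Return the label whose centroid is closest (Euclidean distance assumed)."""
    return min(centroids,
               key=lambda label: math.dist(test_vector, centroids[label]))


# Toy projected features for two hypothetical subjects (labels 1 and 2).
train = [[0.0, 0.1], [0.2, -0.1], [3.0, 3.1], [2.8, 2.9]]
labels = [1, 1, 2, 2]
centroids = class_centroids(train, labels)
predicted = classify([2.5, 2.7], centroids)
```

Each test vector is assigned to the class with the nearest centroid, so only one
reference point per subject has to be stored after training.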


B. SOFTWARE COMPONENTS 


As described earlier in Chapter I, the IR-160 uncooled infrared camera from Infrared
Solutions Inc. was chosen for the study due to its low cost and high performance. All
images taken with the camera are stored in .pgm format. Infrared Solutions Inc. provides
a software program called “infraview,” which can be downloaded from the company website
to view the images in .pgm format.


Table 1 presents the settings of the HyperTerminal program used in this study. The
Windows communication program HyperTerminal remotely handles all commands and operations
of the IR-160. Users are required to set up the program prior to initiating any
operations.


Parameter          Setting
-----------------  --------
Connect using      COM1
Bits per second    115200
Data bits          8
Parity             None
Stop bits          1
Flow control       Hardware

Table 1. HyperTerminal Program Settings (From Ref. 10.).


Apart from the operation of the infrared camera, all image conversion, processing,
storage, and mathematical algorithms are performed in MATLAB 6.5.


C. HARDWARE COMPONENTS 
Figure 5 displays the hardware components of the entire face recognition system. The
system requires an infrared camera for image acquisition, a TV monitor for real-time
image display, and a desktop computer for image processing. In order to obtain subject
images accurately and efficiently, the ThermaCAM monitor was selected for real-time
image display, which allows the user to position the subjects correctly during the data
collection process. Similar to the infrared camera, the ThermaCAM monitor connects to
the host computer over an RS232 cable. The ThermaCAM is commanded to periodically send
out packets of temperature data until told to stop doing so.

Figure 5. The Hardware Components of Face Recognition System (From Ref. 9.).


This chapter described the overall system set-up used for face recognition. Next,
Chapter III presents the data acquisition procedures from photo shooting to file naming.

III. DATA ACQUISITION


This chapter describes the image acquisition procedures and file nomenclature
conventions. In addition, it presents the automatic image cropping schemes and the image
storing and uploading procedures developed for our study.


A. IMAGE COLLECTION 

In order to provide a good set of image data for classification, images must be taken
consistently from each individual subject. Hence, a well-controlled environment is
necessary. Parameters under consideration include the distance between the infrared
camera and the subject's face, the height of the camera, and the position of the face.
In addition, different facial expressions are required to provide various schemes for
examining classification methods. The image acquisition scheme is depicted in Figures 6
and 7.






Figure 6. Lateral View Infrared Camera Set-Up (After Ref. 1.).


Figure 7. Front View Infrared Camera Set-Up (From Ref. 1.).
Figure 6 is the lateral view of the camera and its distance to the subject. Since the
size of the face varies from person to person, the infrared camera must be positioned
far enough away to cover the entire facial structure, but not so far as to lose the
subject's facial characteristics. Under pre-defined conditions, the distance was set at
90 cm between the camera and the subject's forehead. The height of the center of the
camera lens was set at 118 cm from the ground. Next, subjects were required to rotate
their head toward 10 different directions to introduce variability in the images
collected. Figure 7 depicts the front view of the camera as seen by a subject looking
straight ahead. There were nine numbered points (1 to 9) marked on the wall. Each
subject was asked to turn his or her head toward each number. An additional picture was
taken by asking the subjects to look at a random place within the square formed by the
extreme marks [1]. Subjects wearing glasses were asked to remove them to limit
variations among subjects. Finally, an additional picture of each subject looking at the
center of the camera lens was taken with the lens intentionally unfocused, to
investigate the robustness of the classification scheme to image quality variations.
Figure 8 shows a sample of images obtained from the infrared camera.


Figure 8. Sample of Infrared Images (From Ref. 
B. IMAGE NOMENCLATURE

The nomenclature of the images taken from the infrared camera follows that already used
in the earlier study [1]: xx-yy-zz.pgm, where xx is the subject classification number,
yy is the facial orientation corresponding to the marked number on the wall (Figure 7),
zz is the facial expression, and pgm is the built-in file format of the infrared camera
[7]. The range of the selected parameters is the following:

• [1-6, 8, 9, 11-16, 50-85] for xx;
• [1-10] for yy;
• [1, 4, 5, 6] for zz.

The facial expression number zz is further defined as follows:

1 - a neutral expression;

4 - a neutral expression with unfocused lens;

5 - a smiling expression;

6 - pronouncing the vowel “u”.


For example, 51-5-6.pgm represents subject number 51 with the head oriented toward
marked position number 5 (straight ahead) while pronouncing “u.” A total of 50 subjects
with 31 thermal images each were included in the full database, resulting in a total of
1550 images.
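
The naming convention above lends itself to a small parser. The Python sketch below is
illustrative only; the function and type names are hypothetical and are not part of the
software used in the study.

```python
from collections import namedtuple

ImageName = namedtuple("ImageName", "subject orientation expression")

# Expression codes as listed in the nomenclature above.
EXPRESSIONS = {
    1: "neutral",
    4: "neutral, unfocused lens",
    5: "smiling",
    6: 'pronouncing "u"',
}


def parse_image_name(filename):
    """Split 'xx-yy-zz.pgm' into subject, orientation and expression numbers."""
    stem, dot, extension = filename.rpartition(".")
    if dot != "." or extension.lower() != "pgm":
        raise ValueError("expected a .pgm file name: %r" % filename)
    subject, orientation, expression = (int(part) for part in stem.split("-"))
    if expression not in EXPRESSIONS:
        raise ValueError("unknown expression code: %d" % expression)
    return ImageName(subject, orientation, expression)


info = parse_image_name("51-5-6.pgm")

# 10 orientations x 3 focused expressions, plus 1 unfocused image per subject.
images_per_subject = 10 * 3 + 1
total_images = 50 * images_per_subject
```

The last two lines reproduce the database size quoted above: 31 images per subject and
1550 images in total.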


C. IMAGE CROPPING 

Infrared images obtained from the IR-160 camera have dimensions equal to 160x120 pixels.
To avoid potential effects due to different backgrounds, all images were cropped to
isolate the face-only portions [1]. The resulting cropped images were later used in the
training and testing stages of the face recognition system. Because a significant number
of images were used for the study, an automatic and efficient cropping method was
required to save time and computational costs. Many studies use manual cropping
techniques, which require visual inspection and cropping image by image. Others use very
complicated automatic methods, which incur substantial computational costs in addition
to those of the face classification algorithm. The method used for this study is simple
and efficient.


The two-step automatic cropping method has the overall objective of reducing the size of
the image matrix while still retaining the information useful for face recognition.
First, it eliminates the top and bottom image sections, which contain background or neck
areas and are not useful. Then, it applies a similar procedure to crop off the excess
left and right portions. The resulting cropped image retains only the facial
characteristics and has dimensions equal to 60x45 pixels. Figure 9 depicts a raw
infrared image obtained from one of the subjects.


Figure 9. Uncropped Infrared Image with Dimension 160x120 Pixels.


1. “Top and Bottom” Cropping Scheme 

Figure 10 summarizes the “top and bottom” cropping method. The “top and bottom”
automatic cropping scheme retains the facial characteristics below the eyebrows and
above the chin. The resulting cropped image reduces the vertical image dimension from
120 to 60 pixels. First, the camera is positioned so that the images collected contain
only information right above the chin. The MATLAB function “FIND” locates the matrix
elements with values larger than a threshold intensity. In our case, the threshold value
is “1,” and the background elements have intensity levels below it. Since MATLAB is
column-oriented, the image is rotated 90 degrees counterclockwise before applying the
FIND function, which is then used to locate the top of the head. The FIND function
actually returns the index (indices) of the elements that have intensity greater than
the threshold value of “1.” Since the top of the subject's head is rounded, the FIND
function simply returns the index (indices) of the topmost element(s) it reaches first.
The background pixels are then cropped, so that the image only contains information
below the top of the head. Next, we vertically divided the image in half and located the
center-line of the image. Taking the center-line of the image as reference, we manually
removed all information above the eyebrows. Finally, the image is rotated back to the
original vertical orientation. The resulting image now has dimensions of 160x60 pixels.
The above cropping procedure is performed only once, on one image of each subject, and
the settings are automatically applied to all other images of the same class. The MATLAB
function top.m was used to perform top and bottom cropping and is included in the
Appendix.


[Figure 10 flow: rotate 90 degrees counterclockwise; crop top portion; crop top and
bottom portions; rotate 270 degrees counterclockwise.]

Figure 10. Top and Bottom Cropping Method with Dimension 160x60 Pixels.
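
The threshold search behind the “top and bottom” step can be sketched in Python as shown
below. This is not the top.m function from the Appendix: the function names, the
synthetic image, and the `offset_below_head` parameter (standing in for the manually
chosen eyebrow position) are assumptions made for illustration. Because Python lists are
scanned row by row, the sketch does not need the 90-degree rotation used in MATLAB.

```python
def first_row_above_threshold(image, threshold=1):
    """Index of the first row containing a pixel brighter than the threshold."""
    for row_index, row in enumerate(image):
        if any(pixel > threshold for pixel in row):
            return row_index
    raise ValueError("no pixel exceeds the threshold")


def crop_rows(image, offset_below_head, height, threshold=1):
    """Keep `height` rows starting `offset_below_head` rows under the head top."""
    top_of_head = first_row_above_threshold(image, threshold)
    start = top_of_head + offset_below_head
    if start + height > len(image):
        raise ValueError("crop window runs past the bottom of the image")
    return image[start:start + height]


# Tiny synthetic 8x4 "image": rows 0-1 are background, the "head" starts at row 2.
image = [[0, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 5, 5, 0],
         [5, 6, 6, 5],
         [5, 7, 7, 5],
         [5, 8, 8, 5],
         [5, 9, 9, 5],
         [0, 5, 5, 0]]
cropped = crop_rows(image, offset_below_head=1, height=3)
```

In the study the offset is fixed once per subject and reused for all images of that
subject; here it is simply passed in as an argument.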


2. Side Cropping Scheme 

Figure 11 summarizes the side cropping technique, which operates in a similar fashion to
the “top and bottom” method. After the “top and bottom” step, the image has 160x60
pixels. The FIND function is used to locate the leftmost elements of the image matrix,
and the background on the left side is completely eliminated. The image is then
“flipped” to produce its mirror image by using the MATLAB function “FLIP.” Next, the
image is cropped again, flipped back to the original orientation, and calibrated to
retain only the facial elements used for face recognition. The size of the final cropped
image is 60x45 pixels.


[Figure 11 flow: crop left portion; flip; crop left portion; crop both sides.]

Figure 11. Side Cropping Method with Dimension 60x45 Pixels.
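
The crop, flip, crop, flip-back sequence of the side cropping scheme can be sketched in
Python as follows. The function names and the small synthetic image are hypothetical,
and the final calibration to a fixed width of 45 columns is not reproduced here.

```python
def leading_background_columns(image, threshold=1):
    """Number of leftmost columns in which no pixel exceeds the threshold."""
    width = len(image[0])
    for col in range(width):
        if any(row[col] > threshold for row in image):
            return col
    return width


def flip_left_right(image):
    """Mirror the image about its vertical axis."""
    return [row[::-1] for row in image]


def crop_sides(image, threshold=1):
    """Remove background columns on the left, then on the right via a flip."""
    left = leading_background_columns(image, threshold)
    image = [row[left:] for row in image]
    mirrored = flip_left_right(image)
    right = leading_background_columns(mirrored, threshold)
    mirrored = [row[right:] for row in mirrored]
    return flip_left_right(mirrored)


image = [[0, 0, 4, 6, 0],
         [0, 3, 7, 8, 0],
         [0, 0, 5, 5, 0]]
cropped = crop_sides(image)
```

Flipping lets the same left-edge search be reused for the right edge, which mirrors the
reuse of FIND described above.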
D. DIMENSION REDUCTION (EIGENFACES)

Studies have shown that the Principal Component Analysis (PCA) can be successfully used
for data compression. PCA is a linear projection scheme that reduces the dimensionality
of a data set while retaining as much of the variance present in the data as possible
[11]. The background information and basic concepts are covered in the next chapter.
This section describes the use of PCA for dimension reduction.


Since there are a total of 1550 images in our database, the database size can pose
computational problems for desktop computers or workstations due to the size of the
matrices involved in the computations. PCA provides a way to reduce the dimension of the
data matrix and thereby lower computational costs. As mentioned above, PCA is a linear
projection scheme that aims at eliminating unnecessary or redundant information while
keeping the crucial information, for data compression purposes. The PCA concept has been
applied to face recognition with success both in visible and IR imaging [17, 18]. The
eigenvectors that define the lower-dimensional space can be displayed as images and are
called eigenfaces. Figures 12 and 13 show a sample of the cropped images and of the
eigenfaces obtained with the PCA algorithm. Note that Figure 13 has only 63 eigenfaces
rather than the original 100 training images, due to the dimensionality reduction
operation. Figure 14 presents the mean image obtained from all the training images.
Figure 12. A Sample of Cropped Images.


Figure 13. A Sample of Eigenfaces.


Figure 14. Training Data Mean Image.
E. IMAGE STORAGE AND UPLOAD

Overall image collection and storage operations are performed in a systematic fashion.
First, raw collected images are stored in the .pgm format and assigned class number
identifiers, orientation and section numbers according to the nomenclature described
earlier in Section B of this chapter. Next, we apply the automatic cropping process and
store the cropped images in the .bmp format. We add the suffix “-a” to each cropped
image file name to distinguish it from the raw image. Finally, all cropped images are
reshaped columnwise and stored in a single matrix for algorithm implementation. Since
each cropped image has a size equal to 60x45, the associated reshaped column vector has
a size equal to 2700x1. Hence, the resulting data matrix has a size equal to 2700x1500,
excluding the unfocused images, which are stored separately. The data matrix is stored
in the A_all.mat file. The MATLAB function “load_any_img_to_matrix” allows the user to
load any set of images into a single matrix for computations; it is included in the
Appendix.
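
The columnwise reshaping and stacking described above can be sketched in Python as
follows. This is an illustration and not the MATLAB routine from the Appendix; the
function names are hypothetical, and the two synthetic constant-valued images stand in
for real cropped faces.

```python
def image_to_column(image):
    """Flatten a 2-D image column by column (column-major order, as in MATLAB)."""
    rows, cols = len(image), len(image[0])
    return [image[r][c] for c in range(cols) for r in range(rows)]


def build_data_matrix(images):
    """Stack flattened images as the columns of a k x n matrix (list of k rows)."""
    columns = [image_to_column(img) for img in images]
    k = len(columns[0])
    return [[col[i] for col in columns] for i in range(k)]


# Two synthetic 60x45 "images" (60 rows, 45 columns) filled with constant values.
images = [[[value] * 45 for _ in range(60)] for value in (1, 2)]
data = build_data_matrix(images)
shape = (len(data), len(data[0]))

# Column-major flattening of a 2x2 example: first column, then second column.
small = image_to_column([[1, 2],
                         [3, 4]])
```

With 1500 focused images instead of two, the same construction gives the 2700x1500 data
matrix mentioned above.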


This chapter covered data acquisition procedures and image file nomenclature
conventions. It also described the automatic cropping, dimensionality reduction, and
data storage and upload schemes implemented. The next chapter presents the application
of the Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) in face
recognition.

IV. CLASSIFICATION


This chapter presents the application of the Principal Component Analysis (PCA) and the
Fisher Linear Discriminant Analysis, which also incorporates a PCA step, to face
recognition. Over the years, both linear methods have been widely and successfully used
in various face recognition and pattern classification studies.


A. PRINCIPAL COMPONENT ANALYSIS (PCA) METHOD 

Karl Pearson introduced Principal Component Analysis in 1901 in the analysis of
intelligence tests [11]. The basic concept of PCA is to describe the variation of a set
of multivariate data in terms of linearly independent (uncorrelated) variables, each of
which is a particular linear combination of the original variables. The new variables
are derived in decreasing order of importance. For example, the first principal
component accounts for as much as possible of the variation present in the original
data. The overall goal behind the PCA is to determine the smallest number of components
needed to account for most of the variation in the data set. These components are then
used to represent the original data with little or no loss of information, thus
providing a reduction in the dimensionality of the original data and greatly simplifying
the analysis [12].


1. Introduction to PCA 

Geometrically, PCA is a multivariate procedure that rotates the data such that the
directions of maximum variability are projected onto the axes [13]. Essentially, a set
of correlated variables is transformed into a set of uncorrelated variables, which are
ordered by decreasing variability. The uncorrelated variables are linear combinations of
the original variables, and the last of these variables can be removed with minimum loss
of real data information.

Consider the problem of representing $n$ $k$-dimensional samples of image vectors
$\{x_1, \ldots, x_n\}$, where $n$ is the number of images (1500), and $k$ is the size of
each image reshaped columnwise (60x45 = 2700). All images are stacked columnwise in a
single matrix.
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The first principal component is the combination of variables that explains the 


greatest amount of variation. Mathematically, the first principal component y, of the ob- 


servations is the linear combination of the original variables and can be written as: 
Yi Syd ct Gide test Gis (4.1) 


whose sample variance is greatest for all coefficients, a,,,d,,,...,@;, (which may be writ- 


ten as the column vector a, = (q,,,4,5,..-,4,,)' ). Since the variance of y; could be in- 


creased without limit, a restriction is placed on these coefficients; as becomes apparent 


later a sensible constraint is to require that the sum-of-squares of the coefficients, i.e. 


a,'a,, should be set to unity. 


The second principal component defines the next largest amount of variation and 


is independent to the first principal component. In other words, y, is the linear combina- 
tion 


Y= Gay hgh ha Xs (4.2) 


ie, y,=a,'x (where x =(x,,x,,...,x,) ), which has the greatest variance subject to the 


two conditions: a,’a, =1 and a,'a, =0 (so that y, and y, are uncorrelated). Similarly the 


7" principal component is that linear combination 
iE 
yj = a, x, (4.3) 
which has greatest variance subject to a os a,=landa i a,=0 forall i<j. To find the 


coefficients defining the first principal component, the elements of a, must be chosen to 


maximize the variance of y, , subject to constraint a,’a, =1. The variance of y, is given 


by 
Var(y,) = Var(a,' x) 
(4.4) 


Since a’ (x -E (x)) is scalar quantity, the above equation may be rewritten as 
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Var(y,) = E| (a," (x -E(x))(x—E(x))"a,) | 
=a,"B| (x -E(x))(x-E(x))"a, | (4.5) 


=a,Sa,. 


where S = E| (x —E(x))(x-E(x))* ] is the covariance matrix of the original variables. 


Note that it is customary to calculate principal components so they have zero mean. 


Hence, S =E [ <x" | , becomes the data correlation matrix. 


Applying Lagrange multipliers to this maximization problem leads to the solution that $\mathbf{a}_1$ is the eigenvector of $S$ corresponding to the largest eigenvalue and, in general, the $j$th principal component is defined by the eigenvector associated with the $j$th largest eigenvalue.

If the eigenvalues of $S$ are $\lambda_1, \lambda_2, \ldots, \lambda_q$, then, by choosing $\mathbf{a}_j^T\mathbf{a}_j = 1$, the variance of the $j$th component is given by $\lambda_j$. For example, $y_1$ has variance given by (4.5) and, since $\mathbf{a}_1$ is an eigenvector of $S$,

$$S\mathbf{a}_1 = \lambda_1\mathbf{a}_1. \tag{4.6}$$

So (4.5) may be written as

$$\mathrm{Var}(y_1) = \mathbf{a}_1^T\lambda_1\mathbf{a}_1 = \lambda_1\mathbf{a}_1^T\mathbf{a}_1 = \lambda_1, \tag{4.7}$$

where $\mathbf{a}_1^T\mathbf{a}_1 = 1$.


Finally, PCA can be viewed as a rotation of the existing axes to new positions in 
the space defined by the original variables, where there is no correlation between the new 
variables defined by the rotation. The first new variable contains the maximum amount of 
variation; the second new variable contains the maximum amount of variation unex- 
plained by the first and orthogonal to the first. The rest of the new variables behave the 
same way. 
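
The derivation above can be illustrated numerically. The following is a minimal NumPy sketch, not the implementation used in this study: the function name `pca` and the synthetic data are assumptions made only for illustration. It checks that the variance of each projected variable equals the corresponding eigenvalue of the covariance matrix, as in Equation (4.7).

```python
import numpy as np

def pca(data):
    """Return eigenvalues (descending) and matching unit eigenvectors of the
    sample covariance matrix of `data` (rows are observations)."""
    centered = data - data.mean(axis=0)             # zero-mean variables
    cov = centered.T @ centered / (len(data) - 1)   # sample covariance S
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigh: S is symmetric
    order = np.argsort(eigvals)[::-1]               # largest eigenvalue first
    return eigvals[order], eigvecs[:, order]

rng = np.random.default_rng(0)
# Synthetic, correlated 2-D data (illustrative only, not the thesis data)
data = rng.normal(size=(200, 2)) @ np.array([[2.0, 1.2], [0.0, 0.5]])
eigvals, eigvecs = pca(data)
scores = (data - data.mean(axis=0)) @ eigvecs       # y_j = a_j^T x for every sample
```

The columns of `scores` are the new, uncorrelated variables; their sample variances match `eigvals`.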


2. Snapshot Method 
As described earlier in Chapter V, each image contained in the training set is reshaped as a column vector of length $k$, and the data matrix $A$ is defined as the columnwise concatenation of all image vectors, resulting in a $k \times n$ matrix, where $k$ is the dimension of each image and $n$ is the number of training images [1]. For our study, we use 60% of the data images as training data, which corresponds to a $2700 \times 900$ data matrix. Note that the mean image of the training set is subtracted from each image, resulting in a matrix $X$. The data correlation matrix $S$ is defined as follows:

$$S = XX^T. \tag{4.8}$$


The correlation matrix $S$ may have a large dimension, depending on the size of the problem. In our case, we used 60% of the database as the training set, which corresponds to 900 cropped images (each of size $60 \times 45$). The resulting matrix $X$ has size $2700 \times 900$, resulting in a correlation matrix $S$ of size $2700 \times 2700$. As a result, computing the covariance, eigenvectors and eigenvalues can be quite computationally expensive. Note that the maximum number of non-zero eigenvalues obtained from $S = XX^T$, where $X$ is of dimension $k \times n$, is equal to $\min(k, n)$ [15]. Since the number of training images ($n$) is usually smaller than the number of pixels ($k$), and the mean image has been subtracted from every column, the maximum number of non-zero eigenvalues and associated eigenvectors is $n - 1$. Recall that the non-zero eigenvalues of $XX^T$ and $X^TX$ are the same [14]. Furthermore, the eigenvectors associated with the non-zero eigenvalues of $XX^T$ are the same as the eigenvectors of $X^TX$ multiplied by the matrix $X$ and normalized [14]. As a result, the Snapshot method can be used to create the eigenspace from an $n \times n$ matrix rather than a $k \times k$ covariance matrix, as previously shown in [15].


Let $k$ and $n$ be the image dimension (2700) and the number of training images (900), respectively. The singular value decomposition (SVD) of the $k \times n$ real data matrix $X$ is the factorization

$$X = U\Sigma V^T, \tag{4.9}$$

where $U \in \mathbb{R}^{k \times k}$ is unitary, $V \in \mathbb{R}^{n \times n}$ is unitary, and $\Sigma \in \mathbb{R}^{k \times n}$ is diagonal. In addition, the diagonal entries $\sigma_j$ (singular values) of $\Sigma$ are nonnegative and can be stored in non-increasing order; that is, $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_p \ge 0$, where $p = \min(k, n)$ [14]. Recall that the nonzero singular values of the matrix $X$ are the square roots of the nonzero eigenvalues of $X^TX$. Note that

$$X^TX = (U\Sigma V^T)^T(U\Sigma V^T) = V\Sigma^TU^TU\Sigma V^T = V(\Sigma^T\Sigma)V^T. \tag{4.10}$$

Equation (4.10) indicates that the matrix $V$ contains the eigenvectors of $X^TX$ and that the eigenvalues of $X^TX$ are the diagonal elements of the matrix $\Sigma^T\Sigma$ [16]. Further, note that $XX^T$ may be expressed as

$$XX^T = (U\Sigma V^T)(V\Sigma^TU^T) = U\Sigma(V^TV)\Sigma^TU^T = U\Sigma\Sigma^TU^T. \tag{4.11}$$

Equation (4.11) indicates that the matrices $U$ and $\Sigma\Sigma^T$ contain the eigenvectors and eigenvalues of $XX^T$, respectively. Therefore, both $XX^T$ and $X^TX$ have the same non-zero eigenvalues (at most $p$ of them), and $XX^T$ has $k - p$ additional zero eigenvalues if $k > p$.


Next, consider the product of $X$ and $V$,

$$XV = U\Sigma V^TV = U\Sigma, \tag{4.12}$$

where the last step follows because $V$ is unitary. In addition, the eigenvectors associated with nonzero eigenvalues are given by:

$$U = XV\Sigma^{-1}, \tag{4.13}$$

where $\Sigma^{-1}$ is defined as the inverse of the portion of $\Sigma$ that contains the nonzero diagonal elements of $\Sigma$ only [14].


Therefore, the eigenvectors associated with the non-zero eigenvalues of the potentially very large $k \times k$ covariance matrix $XX^T$ may be obtained by computing the SVD of the much smaller $n \times n$ matrix $X^TX$.
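
As an illustration of Equations (4.10) to (4.13), the sketch below computes the eigenvectors of $XX^T$ from the small matrix $X^TX$. It is a minimal NumPy example under stated assumptions: the function name `snapshot_eigenfaces`, the tolerance `tol` and the toy matrix sizes are hypothetical, and random numbers stand in for the image data.

```python
import numpy as np

def snapshot_eigenfaces(X, tol=1e-10):
    """Eigenvectors of X X^T (k x k) obtained from the small n x n matrix X^T X."""
    small = X.T @ X                                # n x n instead of k x k
    eigvals, V = np.linalg.eigh(small)             # symmetric eigenproblem
    order = np.argsort(eigvals)[::-1]              # largest eigenvalue first
    eigvals, V = eigvals[order], V[:, order]
    keep = eigvals > tol * eigvals[0]              # drop numerically zero eigenvalues
    eigvals, V = eigvals[keep], V[:, keep]
    U = X @ V / np.sqrt(eigvals)                   # U = X V Sigma^{-1}, Eq. (4.13)
    return eigvals, U

rng = np.random.default_rng(1)
k, n = 300, 20                                     # toy sizes; the study uses 2700 and 900
A = rng.normal(size=(k, n))                        # random stand-in for image columns
X = A - A.mean(axis=1, keepdims=True)              # subtract the mean image
eigvals, U = snapshot_eigenfaces(X)
```

Because the mean image is removed, only $n - 1$ eigenvectors survive the threshold, and each column of `U` satisfies $XX^T\mathbf{u} = \lambda\mathbf{u}$ without the $k \times k$ matrix ever being decomposed.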
3. Projectors

The idea to use projection matrices or projectors is to project a set of data into a 
smaller subspace for data analysis. As mentioned in Chapter III, the projection matrix is 
created from the data matrix, which is then projected onto the feature space. PCA per- 
forms dimensionality reduction by using a projection matrix composed of eigenvectors 
selected from the eigenvector matrix U (also called “eigenfaces” in face recognition ap- 


plications) to project the original data into a lower dimensional subspace. 
A projector is a square matrix $P$ that satisfies

$$P^2 = P. \tag{4.14}$$

This definition includes both orthogonal and non-orthogonal projectors [14]. Figure 15 illustrates a generic non-orthogonal projection. The term projector might be thought of as arising from the notion that if one were to shine a light onto the subspace $\mathrm{range}(P)$ from just the right direction, then $P\mathbf{v}$ would be the shadow projected by the vector $\mathbf{v}$. Observe that applying the projector results in $\mathbf{v}$ itself if $\mathbf{v} \in \mathrm{range}(P)$. Mathematically, we have $\mathbf{v} = P\mathbf{x}$ for some $\mathbf{x}$ and

$$P\mathbf{v} = P^2\mathbf{x} = P\mathbf{x} = \mathbf{v}. \tag{4.15}$$







Figure 15. A Non-orthogonal Projection. 
Applying the projector to the vector $P\mathbf{v} - \mathbf{v}$ gives a zero result:

$$P(P\mathbf{v} - \mathbf{v}) = P^2\mathbf{v} - P\mathbf{v} = 0. \tag{4.16}$$

This equation means that $P\mathbf{v} - \mathbf{v} \in \mathrm{null}(P)$. That is, the direction of the light may be different for different $\mathbf{v}$, but is always described by a vector in $\mathrm{null}(P)$.


The orthogonal projector illustrated in Figure 16 is one that projects onto a subspace $S_1$ along $S_2$, where $S_1$ and $S_2$ are orthogonal. From now on, we mainly deal with orthogonal projectors.
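
The projector properties in Equations (4.14) to (4.16) can be checked on small matrices. The matrices below are illustrative assumptions chosen by hand and do not come from the study; the orthogonal projector is built as $QQ^T$ from a matrix $Q$ with orthonormal columns.

```python
import numpy as np

# Oblique (non-orthogonal) projector onto span{(1, 0)} along span{(1, 1)}
P_oblique = np.array([[1.0, -1.0],
                      [0.0,  0.0]])

# Orthogonal projector onto the column space of Q (orthonormal columns): P = Q Q^T
Q, _ = np.linalg.qr(np.array([[1.0], [2.0], [2.0]]))
P_orth = Q @ Q.T

v = np.array([3.0, 1.0])
residual = P_oblique @ v - v        # Pv - v, which lies in null(P) by Eq. (4.16)
```

Both matrices satisfy $P^2 = P$, but only `P_orth` is symmetric, which is what distinguishes an orthogonal projector from an oblique one.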





Figure 16. An Orthogonal Projection. 


4. An Example of PCA in Dimensionality Reduction 

The following example demonstrates the use of PCA in dimensionality reduction 
[13]. In this example, we take a simple set of two-dimensional data and apply PCA to de- 
termine the principal axes. Although the technique can be used with higher dimensional 


data, two-dimensional data will make it simpler to visualize. 


Figure 17 shows a plot of the $100 \times 2$ data matrix $X$, which corresponds to a training data set containing two images with $10 \times 10$ pixels each:

$$X = \begin{bmatrix}
3.072 & -1.7988 \\
5.6931 & 3.3327 \\
2.5530 & 1.4945 \\
\vdots & \vdots \\
-0.5495 & -0.3217 \\
3.0858 & 1.8064 \\
-2.8885 & -1.6909
\end{bmatrix}. \tag{4.17}$$








Figure 17. A Scatter Plot of Data Matrix X . 


Following the Snapshot method, the correlation matrix is first computed as follows:

$$S = \begin{pmatrix} 5.9823 & 3.5019 \\ 3.5019 & 2.0500 \end{pmatrix}. \tag{4.18}$$

Next, PCA is performed and the principal components are calculated:

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = \begin{pmatrix} -0.8630 & -0.5052 \\ -0.5052 & 0.8630 \end{pmatrix}. \tag{4.19}$$


Figure 18 shows the first and second principal component axes plotted on top of the scatter plot of $X$. The red and green lines represent the directions of the first and second principal components, respectively. Note how the first principal component lies along the line of greatest variation, and the second lies perpendicular to it. Where there are more than two dimensions, the second component will be perpendicular to the first and along the line of next greatest variation.





Figure 18. First and Second Principal Components. 




Using the projector concept, the original data set was multiplied by the principal 
components. The data was rotated and laid along the direction of the first principal com- 


ponent. The result is illustrated in Figure 19. 







Figure 19. Data lies along with the first Principal Component. 


The most common use for the Principal Component Analysis is to reduce the dimensionality of the data while retaining the most information. Figure 20 shows that all the data are projected onto the direction of the first principal component, thus reducing dimensionality. However, the Principal Component Analysis actually smears the classes together so that they are no longer linearly separable in the projected space [17].

Figure 20. Data Projected onto One Dimension along the First Principal Component.


5. An Example of PCA in Classification


Given two sets of cluster data as shown in Figure 21, we applied the Principal 
Component Analysis algorithm. In this case each set of clusters is “well” separated from 
one another which should allow the PCA to classify the two clusters. The green line 


represents the direction of the first principal component. 
Figure 21. The Principal Component Analysis in Two “Well” Separated Clusters.


The Principal Component Analysis works well for the above example since the 


clusters are “well” separated and easy to classify. 


B. LINEAR DISCRIMINANT ANALYSIS (LDA) METHOD 
Today, the most widely used method for face recognition is Fisher’s Linear Dis- 
criminant Analysis or Linear Discriminant Analysis (LDA). First, we introduce the two- 


class LDA and next extend to the C-class LDA approach. 


1. Two-Class LDA 

The objective of the Linear Discriminant Analysis is to perform dimensionality 
reduction while preserving as much of the class discriminatory information as possible 
[19]. First, we examine the simplest case of the Linear Discriminant Analysis with two 


classes, and we later generalize to the C-class Linear Discriminant Analysis. 
Assume we have a set of D-dimensional samples $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N\}$, where $N$ is the number of samples, $N_1$ of which belong to class $C_1$, and $N_2$ to class $C_2$. Let us consider a linear transformation mapping the original n-dimensional image space into an m-dimensional feature space, where $m < n$. The new feature vectors $\mathbf{y} \in \mathbb{R}^m$ are defined by the following linear transformation:

$$\mathbf{y} = \mathbf{w}^T\mathbf{x}, \tag{4.20}$$

where $\mathbf{w} \in \mathbb{R}^{n \times m}$ is a matrix with orthonormal columns [17]. Of all possible lines, we would like to select the one that best maximizes the separability of the scalars, as illustrated in Figure 22.


Figure 22. PCA Projection Direction (left graph), LDA Projection Direction (right graph): LDA Projection Maximizes Data Separability (From Ref. 20.).


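To find a good projection vector $\mathbf{w}$, we have to define a measure of separation between the projected classes. The mean vector of each class in the $\mathbf{x}$ and $y$ feature spaces is defined as

$$\boldsymbol{\mu}_i = \frac{1}{N_i}\sum_{\mathbf{x} \in C_i}\mathbf{x} \tag{4.21}$$

and

$$\tilde{\mu}_i = \frac{1}{N_i}\sum_{y \in C_i} y = \frac{1}{N_i}\sum_{\mathbf{x} \in C_i}\mathbf{w}^T\mathbf{x} = \mathbf{w}^T\boldsymbol{\mu}_i. \tag{4.22}$$

We then choose the distance between the projected means as our cost function:

$$J(\mathbf{w}) = \left|\tilde{\mu}_1 - \tilde{\mu}_2\right| = \left|\mathbf{w}^T(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)\right|. \tag{4.23}$$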


However, the distance between the projected means is not a very good measure since it 
does not take into account the standard deviation within the classes as illustrated in Fig- 


ure 23. 


Figure 23. Projected Means onto Cartesian Coordinates (From Ref. 20.). [Figure annotations: "This axis yields better class separability"; "This axis has a larger distance between means".]


The solution proposed by Fisher is to maximize a function that represents the distance between the means, normalized by a measure of the within-class scatter. For each class we define the scatter, an equivalent of the variance, as

$$\tilde{s}_i^2 = \sum_{y \in C_i}(y - \tilde{\mu}_i)^2, \tag{4.24}$$

where the quantity $(\tilde{s}_1^2 + \tilde{s}_2^2)$ is called the within-class scatter. The Fisher linear discriminant is defined as the linear function $\mathbf{w}^T\mathbf{x}$ that maximizes the criterion function:

$$J(\mathbf{w}) = \frac{\left|\tilde{\mu}_1 - \tilde{\mu}_2\right|^2}{\tilde{s}_1^2 + \tilde{s}_2^2}. \tag{4.25}$$





Therefore, we look for a projection where samples from the same class are projected very close to each other and, at the same time, the projected means are as far apart as possible. Figure 24 depicts the projected within-class scatters and class means.





Figure 24. Projected Within Scatters and Class Means (From Ref. 20.). 


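In order to find the optimum projection $\mathbf{w}^*$, we need to express $J(\mathbf{w})$ as an explicit function of $\mathbf{w}$. We define a measure of the scatter in the multivariate feature space $\mathbf{x}$, the scatter matrices:

$$S_i = \sum_{\mathbf{x} \in C_i}(\mathbf{x} - \boldsymbol{\mu}_i)(\mathbf{x} - \boldsymbol{\mu}_i)^T, \tag{4.26}$$

and

$$S_1 + S_2 = S_W, \tag{4.27}$$

where $S_W$ is called the within-class scatter matrix. The scatter of the projection $y$ can then be expressed as a function of the scatter matrix in feature space $\mathbf{x}$:

$$\tilde{s}_i^2 = \sum_{y \in C_i}(y - \tilde{\mu}_i)^2 = \sum_{\mathbf{x} \in C_i}(\mathbf{w}^T\mathbf{x} - \mathbf{w}^T\boldsymbol{\mu}_i)^2 = \sum_{\mathbf{x} \in C_i}\mathbf{w}^T(\mathbf{x} - \boldsymbol{\mu}_i)(\mathbf{x} - \boldsymbol{\mu}_i)^T\mathbf{w} = \mathbf{w}^TS_i\mathbf{w}, \tag{4.28}$$

and

$$\tilde{s}_1^2 + \tilde{s}_2^2 = \mathbf{w}^TS_W\mathbf{w}. \tag{4.29}$$

Similarly, the difference between the projected means can be expressed in terms of the means in the original feature space as:

$$(\tilde{\mu}_1 - \tilde{\mu}_2)^2 = (\mathbf{w}^T\boldsymbol{\mu}_1 - \mathbf{w}^T\boldsymbol{\mu}_2)^2 = \mathbf{w}^T\underbrace{(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)^T}_{S_B}\mathbf{w} = \mathbf{w}^TS_B\mathbf{w}, \tag{4.30}$$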
where the matrix $S_B$ is called the between-class scatter. Note that, since $S_B$ is the outer product of two vectors, its rank is at most one. We can finally express the Fisher criterion in terms of $S_W$ and $S_B$ as:

$$J(\mathbf{w}) = \frac{\mathbf{w}^TS_B\mathbf{w}}{\mathbf{w}^TS_W\mathbf{w}}. \tag{4.31}$$


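To find the maximum of $J(\mathbf{w})$, we differentiate and equate to zero:

$$\begin{aligned}
\frac{d}{d\mathbf{w}}\left[J(\mathbf{w})\right] &= \frac{d}{d\mathbf{w}}\left[\frac{\mathbf{w}^TS_B\mathbf{w}}{\mathbf{w}^TS_W\mathbf{w}}\right] = 0 \\
\Rightarrow\ & \left[\mathbf{w}^TS_W\mathbf{w}\right]\frac{d\left[\mathbf{w}^TS_B\mathbf{w}\right]}{d\mathbf{w}} - \left[\mathbf{w}^TS_B\mathbf{w}\right]\frac{d\left[\mathbf{w}^TS_W\mathbf{w}\right]}{d\mathbf{w}} = 0 \\
\Rightarrow\ & \left[\mathbf{w}^TS_W\mathbf{w}\right]2S_B\mathbf{w} - \left[\mathbf{w}^TS_B\mathbf{w}\right]2S_W\mathbf{w} = 0.
\end{aligned} \tag{4.32}$$

Dividing by $\mathbf{w}^TS_W\mathbf{w}$ (and cancelling the common factor of 2),

$$\begin{aligned}
& \left[\frac{\mathbf{w}^TS_W\mathbf{w}}{\mathbf{w}^TS_W\mathbf{w}}\right]S_B\mathbf{w} - \left[\frac{\mathbf{w}^TS_B\mathbf{w}}{\mathbf{w}^TS_W\mathbf{w}}\right]S_W\mathbf{w} = 0 \\
\Rightarrow\ & S_B\mathbf{w} - JS_W\mathbf{w} = 0 \\
\Rightarrow\ & S_W^{-1}S_B\mathbf{w} - J\mathbf{w} = 0.
\end{aligned} \tag{4.33}$$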
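Solving the generalized eigenvalue problem ($S_W^{-1}S_B\mathbf{w} = J\mathbf{w}$) yields

$$\mathbf{w}^* = \arg\max_{\mathbf{w}}\left\{\frac{\mathbf{w}^TS_B\mathbf{w}}{\mathbf{w}^TS_W\mathbf{w}}\right\} = S_W^{-1}(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2). \tag{4.34}$$

The above equation is known as Fisher’s Linear Discriminant, although it is not a discriminant but rather a specific choice of direction for the projection of the data down to one dimension.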
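
The closed-form direction of Equation (4.34) can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the code used in the study: the function names `fisher_direction` and `fisher_criterion` and the synthetic Gaussian clusters are assumptions made for this example.

```python
import numpy as np

def fisher_direction(X1, X2):
    """Fisher direction w* = S_W^{-1}(mu_1 - mu_2) for two classes (rows are samples)."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    S1 = (X1 - mu1).T @ (X1 - mu1)                 # class scatter, Eq. (4.26)
    S2 = (X2 - mu2).T @ (X2 - mu2)
    S_W = S1 + S2                                  # within-class scatter, Eq. (4.27)
    w = np.linalg.solve(S_W, mu1 - mu2)            # Eq. (4.34), defined up to scale
    return w / np.linalg.norm(w)

def fisher_criterion(w, X1, X2):
    """J(w) = (w^T S_B w) / (w^T S_W w), Eq. (4.31)."""
    mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
    S_W = (X1 - mu1).T @ (X1 - mu1) + (X2 - mu2).T @ (X2 - mu2)
    d = mu1 - mu2
    return (w @ d) ** 2 / (w @ S_W @ w)

rng = np.random.default_rng(2)
cov = np.array([[3.0, 2.5], [2.5, 3.0]])           # elongated synthetic clusters
X1 = rng.multivariate_normal([0.0, 0.0], cov, size=100)
X2 = rng.multivariate_normal([1.5, -1.5], cov, size=100)
w_star = fisher_direction(X1, X2)
```

Because `w_star` maximizes the criterion exactly, `fisher_criterion(w_star, X1, X2)` is never smaller than the value obtained for any other direction.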
2. C-Class LDA

We now generalize Fisher’s LDA to the C-class solution. Instead of one projection $y$, we seek $(C-1)$ projections $[y_1, y_2, \ldots, y_{C-1}]$ by means of $(C-1)$ projection vectors $\mathbf{w}_i$, which can be arranged by columns into the projection matrix

$$W = \left[\mathbf{w}_1 \,|\, \mathbf{w}_2 \,|\, \cdots \,|\, \mathbf{w}_{C-1}\right], \tag{4.35}$$

where $i = 1, 2, \ldots, C-1$. The set of $(C-1)$ projections may be rewritten in matrix form as:

$$\mathbf{y} = \left[y_1, y_2, \ldots, y_{C-1}\right]^T = W^T\mathbf{x}. \tag{4.36}$$

The generalization of the within-class scatter is

$$S_W = \sum_{i=1}^{C}S_i, \tag{4.37}$$

where $S_i = \sum_{\mathbf{x} \in C_i}(\mathbf{x} - \boldsymbol{\mu}_i)(\mathbf{x} - \boldsymbol{\mu}_i)^T$ and $\boldsymbol{\mu}_i = \frac{1}{N_i}\sum_{\mathbf{x} \in C_i}\mathbf{x}$.

The generalization of the between-class scatter to the C-class problem is given by

$$S_B = \sum_{i=1}^{C}N_i(\boldsymbol{\mu}_i - \boldsymbol{\mu})(\boldsymbol{\mu}_i - \boldsymbol{\mu})^T, \tag{4.38}$$

where $\boldsymbol{\mu} = \frac{1}{N}\sum_{\forall\mathbf{x}}\mathbf{x} = \frac{1}{N}\sum_{i=1}^{C}N_i\boldsymbol{\mu}_i$.

We also define $S_T$ as the total scatter matrix, which is the sum of $S_B$ and $S_W$:

$$S_T = S_B + S_W. \tag{4.39}$$

Figure 25 illustrates the between-class and within-class scatters.

Figure 25. Within-Class ($S_W$) and Between-Class ($S_B$) Scatters (From Ref. 20.).
Recall that we are now looking for a projection that maximizes the ratio of between-class to within-class scatter. Since the projection is no longer a scalar (it has $C-1$ dimensions), we use the determinant of the scatter matrices to obtain a scalar objective function:

$$J(W) = \frac{\left|\tilde{S}_B\right|}{\left|\tilde{S}_W\right|} = \frac{\left|W^TS_BW\right|}{\left|W^TS_WW\right|}, \tag{4.40}$$

and we seek the projection matrix $W^*$ that maximizes the above ratio.

It can be shown that the optimal projection matrix $W^*$ is the one whose columns are the eigenvectors corresponding to the largest eigenvalues of the following generalized eigenvalue problem,

$$W^* = \left[\mathbf{w}_1^* \,|\, \mathbf{w}_2^* \,|\, \cdots \,|\, \mathbf{w}_{C-1}^*\right] = \arg\max\left\{\frac{\left|W^TS_BW\right|}{\left|W^TS_WW\right|}\right\} \tag{4.41}$$

$$\Rightarrow\ (S_B - \lambda_iS_W)\mathbf{w}_i^* = 0. \tag{4.42}$$
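
The C-class construction in Equations (4.37) to (4.42) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the study's implementation: the function name `lda_projection` is hypothetical, the three Gaussian clusters are synthetic, and the generalized eigenvalue problem is solved through the eigenvectors of $S_W^{-1}S_B$, which requires $S_W$ to be non-singular.

```python
import numpy as np

def lda_projection(X, labels):
    """Return (W*, S_W, S_B); W* holds the C-1 leading eigenvectors of S_W^{-1} S_B."""
    classes = np.unique(labels)
    mu = X.mean(axis=0)                            # overall mean
    d = X.shape[1]
    S_W = np.zeros((d, d))
    S_B = np.zeros((d, d))
    for c in classes:
        Xc = X[labels == c]
        mu_c = Xc.mean(axis=0)
        S_W += (Xc - mu_c).T @ (Xc - mu_c)         # Eq. (4.37)
        diff = (mu_c - mu)[:, None]
        S_B += len(Xc) * (diff @ diff.T)           # Eq. (4.38)
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_W, S_B))
    order = np.argsort(eigvals.real)[::-1][: len(classes) - 1]
    return eigvecs[:, order].real, S_W, S_B

rng = np.random.default_rng(3)
means = np.array([[0, 0, 0], [4, 0, 1], [0, 4, -1]], dtype=float)
X = np.vstack([rng.normal(m, 1.0, size=(50, 3)) for m in means])   # 3 synthetic classes
labels = np.repeat(np.arange(3), 50)
W, S_W, S_B = lda_projection(X, labels)
projected = X @ W                                  # y = W^T x for every sample
```

With three classes, `W` has two columns, so the three-dimensional samples are mapped to a two-dimensional feature space.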


3. LDA Example 


The following example demonstrates the use of LDA in classification. 


Given 2 ($= C$) classes in 3 ($= d$) dimensions, each with 100 ($= n$) samples:

$$C_1 = \begin{pmatrix}
-4.33 & -1.67 & \cdots & -0.83 & -0.27 \\
-0.19 & -1.20 & \cdots & 2.19 & -0.12 \\
6.64 & 5.40 & \cdots & 6.62 & 7.27
\end{pmatrix} \tag{4.43}$$

and

$$C_2 = \begin{pmatrix}
-0.90 & 0.14 & \cdots & -0.19 & -1.05 \\
0.07 & 0.28 & \cdots & -0.73 & -0.06 \\
-1.44 & 0.61 & \cdots & -1.70 & 0.72
\end{pmatrix}. \tag{4.44}$$

Figure 26 presents a three-dimensional plot of the cluster scatters of $C_1$ and $C_2$. Figure 27 shows the perpendicular projection of the data scatters onto two dimensions for visualization purposes. Figure 27 shows that it is very difficult to differentiate or separate the two classes when they are projected into the two-dimensional space.

Figure 26. Three Dimensional Cluster Scatters of Class $C_1$ (blue) and $C_2$ (red).

Figure 27. Projected Two-Dimensional Cluster Scatters of Class $C_1$ (blue) and $C_2$ (red).


We now proceed to calculate the within-class ($S_W$) and between-class ($S_B$) scatter matrices. The results are as follows:

$$S_W = \begin{pmatrix}
371.1504 & 100.2341 & 0.9891 \\
100.2341 & 65.9105 & 45.9159 \\
0.9891 & 45.9159 & 228.5554
\end{pmatrix} \tag{4.45}$$

and

$$S_B = \begin{pmatrix}
1362.3194 & 714.9600 & -468.1968 \\
714.9600 & 375.2182 & -245.7147 \\
-468.1969 & -245.7148 & 160.9081
\end{pmatrix}. \tag{4.46}$$

Since we have both $S_W$ and $S_B$, we can seek the projection matrix $W^*$, which is the following:

$$W^* = \begin{pmatrix}
-0.95 & -0.06 & -0.74 \\
0.27 & 0.88 & 0.44 \\
-0.14 & -0.46 & 0.52
\end{pmatrix}. \tag{4.47}$$


To check that $W^*$ maximizes the separability between $C_1$ and $C_2$, we project the original scatters onto $W^*$ in two dimensions. Figure 28 depicts the final results of the Linear Discriminant Analysis of the scatters projected along $W^*$ onto a two-dimensional subspace. The green solid line represents the direction of $W^*$ in the two-dimensional subspace.





Figure 28. Scatters Projected along W* in Two-Dimensional Subspace. 
C. PCA VERSUS LDA IN CLASSIFICATION

PCA is best used in data compression due to its dimensionality reduction capability [1]. In many cases, the Principal Component Analysis smears the classes together so that they are no longer linearly separable in the projected space [17]. By contrast, LDA is far superior to PCA in pattern classification applications. For example, consider another two sets of clusters, which are oriented in the same direction and lie along the first principal component of the overall data correlation matrix, as illustrated in Figure 29.







Figure 29. PCA Results on Two “Unseparable” Clusters. 


The green solid line in Figure 29 represents the first principal component direc- 
tion. Note that class discrimination is not preserved when projecting the two classes onto 


that direction [1]. 


Now, consider the same set of data shown in Figure 29. Figure 30 shows the projection direction obtained with the Linear Discriminant Analysis process as a green solid line. The data that were indistinguishable after applying PCA are now separated by the Linear Discriminant Analysis. The two sets of clusters are no longer oriented in the same direction along the best projection direction of the LDA, which preserves class discrimination.





Figure 30. LDA Results on Two-Class Problem. 


D. FISHERFACE 

Proposed by Belhumeur, Hespanha and Kriegman [17], the method called “Fisherface” avoids the singularity of the within-class scatter matrix $S_W$ by first projecting the image data onto a lower dimensional subspace in which $S_W$ is non-singular. PCA is applied to reduce the dimension to $n - C$, where $n$ is the total number of images in the data set and $C$ is the total number of classes. Next, LDA is applied to further reduce the dimension to $C - 1$. This approach was successful with the smaller database [1]; therefore, we also applied the Fisherface approach to the expanded database in our follow-on study.
E. CLASSIFIER

The database is divided into two non-overlapping sets: the training set contains 60% of the data per class and the testing set contains the remaining 40%. Next, we used PCA and LDA to reduce the dataset dimensions. The overall objective is to assign each image from the testing set to a corresponding class. First, 50 class-specific centroids are computed from the training data, one per class, by averaging the projected training data of each class [1]. Next, the classification decision is made by computing the distance between the projected testing image features and the class centroids, and selecting the class that leads to the smallest distance.
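
The centroid-based decision rule described above can be sketched as follows. The text does not state which distance is used, so the Euclidean distance here is an assumption, as are the function names `fit_centroids` and `classify` and the toy feature values.

```python
import numpy as np

def fit_centroids(features, labels):
    """Average the projected training features of each class."""
    classes = np.unique(labels)
    centroids = np.array([features[labels == c].mean(axis=0) for c in classes])
    return classes, centroids

def classify(features, classes, centroids):
    """Assign each row to the class whose centroid is nearest (Euclidean, assumed)."""
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(dists, axis=1)]

# Toy projected features: 3 classes, 2 features (hypothetical values)
train = np.array([[0.0, 0.1], [0.2, -0.1],
                  [5.0, 5.1], [4.8, 4.9],
                  [0.0, 5.0], [0.2, 5.2]])
train_labels = np.array([0, 0, 1, 1, 2, 2])
classes, centroids = fit_centroids(train, train_labels)
test = np.array([[0.1, 0.3], [4.5, 5.5], [0.5, 4.0]])
predicted = classify(test, classes, centroids)
```

In the study's setting, `features` would hold the PCA or LDA projections of the training images and `classes` would contain the 50 subject labels.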


This chapter presented the basic concepts behind the Principal Component Analysis and the Fisher Linear Discriminant Analysis. We described the projection operations applied to extract class-specific information from the training dataset, and the class decision process. In addition, the examples presented in this chapter illustrate the differences between PCA and LDA. The next chapter presents the overall experimental results obtained with our database.
V. RESULTS


To determine the performance of the PCA and LDA-based methods used for face recognition, we implemented k-fold cross validation. K-fold cross validation is a statistical scheme that can be applied to estimate the generalization error of a given model, or to select the one of several potential models with the smallest estimated generalization error [21]. In our case, we want to select, among the various PCA-based schemes and the LDA implementation, the algorithm with the best classification performance. K-fold cross validation is quite successful for small databases. The 1500-image database, corresponding to 50 subjects with 30 images each, was fully utilized. In addition, this chapter presents the impact of an unfocused infrared camera lens on classification performance.


A. K-FOLD CROSS VALIDATION 

Cross-validation is a method designed for estimating the generalization error based on "resampling" [21]. In k-fold cross-validation, the data set is divided into $k$ subsets and the model is trained $k$ times, each time leaving out one of the subsets from training and using only the omitted subset to compute the error criterion. For each of the $k$ experiments, we used $k - 1$ folds for training and the remaining fold for testing. The resulting error is estimated as the mean error rate [22] and defined as:

$$E = \frac{1}{k}\sum_{i=1}^{k}e_i, \tag{4.48}$$

where $e_i$ is the error rate of the $i$th experiment. Figure 31 depicts the concept behind k-fold cross validation.
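
The mean error rate of Equation (4.48) can be sketched with a generic k-fold loop. This example shows the textbook partition into $k$ disjoint folds and is not the exact resampling procedure of this study; the function names `k_fold_error` and `majority_classifier`, the `seed` parameter and the toy data are assumptions, and the majority-vote model is only a placeholder for a PCA- or LDA-based classifier.

```python
import numpy as np

def k_fold_error(features, labels, k, train_and_test, seed=0):
    """Mean error rate E = (1/k) * sum_i e_i over k folds, as in Eq. (4.48)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(labels)), k)
    errors = []
    for i in range(k):
        test_idx = folds[i]                        # omitted subset, used for testing only
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        predicted = train_and_test(features[train_idx], labels[train_idx],
                                   features[test_idx])
        errors.append(np.mean(predicted != labels[test_idx]))   # e_i
    return float(np.mean(errors)), errors

def majority_classifier(train_x, train_y, test_x):
    """Placeholder model: always predicts the most frequent training label."""
    values, counts = np.unique(train_y, return_counts=True)
    return np.full(len(test_x), values[np.argmax(counts)])

features = np.arange(20, dtype=float).reshape(10, 2)
labels = np.array([0] * 8 + [1] * 2)
mean_error, fold_errors = k_fold_error(features, labels, k=5,
                                       train_and_test=majority_classifier)
```

Any function with the same three-argument signature as `majority_classifier` can be passed in, so the same loop can compare several classification schemes on identical folds.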
Figure 31. K-Fold Cross Validation. [Figure labels: "Total number of examples"; "Test examples".]


Figure 32 illustrates the use of k-fold cross validation in our study. For each experiment, the database is split into non-overlapping testing and training sets. The training set includes 60% of each class data and is used to compute projection directions and class centroids. The testing set contains the remaining 40% of the images and is used to test the overall classification performance. The process is trained $k$ times, where $k$ corresponds to the number of training subsets. Each time, one of the subsets from the training set is omitted and only this omitted subset is used to compute the mean error. To ensure that k-fold cross validation is suitable for our experiment, we tested different $k$ values (100, 200, 500, 900, and 1000). Results indicated that any value above 200 produces less than 1% deviation in the resulting overall classification performance. Therefore, we chose $k = 900$ for the general results of our study. The overall classification performance corresponds to the mean and median error rates obtained from all experiments.

Figure 32. K-fold Cross Validation in Face Recognition.
B. CHOOSING EIGENVECTORS

One of the objectives of our study is to design an effective and efficient model that is best suited for IR face recognition. We initially selected the Principal Component Analysis for its superiority in dimensionality reduction. First, we define the “useful” eigenvectors obtained from the PCA as those associated with eigenvalues equal to at least 0.1% of the maximum eigenvalue, following the earlier study by Pereira [1]. By simply removing eigenvectors with associated eigenvalues below the user-specified threshold, we reduce the problem dimension and lower the computational cost without degrading the classification performance [17].
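
The eigenvalue threshold described above can be sketched as a simple filter. The function name `useful_eigenvectors`, the `fraction` parameter and the example spectrum are hypothetical and chosen only to illustrate the 0.1% rule.

```python
import numpy as np

def useful_eigenvectors(eigvals, eigvecs, fraction=0.001):
    """Keep eigenvectors whose eigenvalue is at least `fraction` of the largest one."""
    keep = eigvals >= fraction * eigvals.max()
    return eigvals[keep], eigvecs[:, keep]

eigvals = np.array([50.0, 4.0, 0.06, 0.04, 1e-9])   # hypothetical spectrum
eigvecs = np.eye(5)                                  # stand-in eigenvector matrix
kept_vals, kept_vecs = useful_eigenvectors(eigvals, eigvecs)
```

With a largest eigenvalue of 50, the 0.1% threshold is 0.05, so the first three eigenvectors are retained and the last two are discarded.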


Figure 33 presents the error rate as a function of the number of eigenvectors used for the PCA-based schemes. The maximum number of PCA-based projections is directly related to the size of the data used. Some eigenvectors and the corresponding projection directions can be eliminated because the associated eigenvalues are close to or equal to zero. In reality, the maximum number of PCA-based projection directions associated with non-zero eigenvalues is equal to the size of the dataset when the database size is smaller than the data dimension [1]. In this study, the maximum number of PCA-based projection directions associated with non-zero eigenvalues is presumably 900 (60% of 1500 images), which is far less than the data dimension of 2700 (the $60 \times 45$ pixel size). Results show that the best classification performance is obtained with a minimum of 50 projection directions.

Figure 33. Error Rate vs. the Number of Eigenvectors used in PCA Classification.


C. INCREASING NUMBER OF CLASSES 

We began our investigations by increasing the number of classes to observe the various PCA-based and LDA classification schemes. As mentioned earlier in Sections A and B of this chapter, the minimum number of eigenvectors is 50 and the smallest number of repetitions for k-fold cross validation is 200, in order to yield satisfactory results. Table 2 lists the mean error rates obtained for the various PCA-based schemes and the LDA implementation for 14 to 20 classes, in increments of 1 class at a time.


Using the results obtained from Figure 33, we used the top 50 eigenvectors of the PCA-based schemes, which results in a mean error rate between 18% and 20% for qualitative analysis. The following notations are used for the various PCA-based schemes and the LDA implementation, with all PCA-based schemes prefixed with “PCA”:

• PCA50: using the top 50 eigenvectors;

• PCA50W1: using the top 50 eigenvectors after removing the top eigenvector;

• PCA50W2: using the top 50 eigenvectors after removing the top two eigenvectors;

• PCA50W3: using the top 50 eigenvectors after removing the top three eigenvectors;

• PCA50W4: using the top 50 eigenvectors after removing the top four eigenvectors;

• PCA50W5: using the top 50 eigenvectors after removing the top five eigenvectors; and

• LDA: the LDA implementation.


Table 2 lists and Figure 34 plots the mean error rates in percentages, expressed as a function of the number of classes. Results show that the best-performing PCA-based scheme in the 14-20 class range is PCA50W3 for 14 and 15 classes, PCA50W4 for 16 and 17 classes, PCA50W2 for 18 and 19 classes, and PCA50 for 20 classes. Results illustrate that removing the first few top eigenvectors does not improve the PCA-based algorithm consistently, as we noted in a small database environment in the earlier Pereira study [1]. Results also show that the LDA scheme performs consistently better, with an error rate under 1% for this range of classes.


# of Classes | PCA50       | PCA50W1     | PCA50W2 | PCA50W3     | PCA50W4     | PCA50W5 | LDA
14           | 17.88       | [illegible] | 12.31   | 8.51        | 10.03       | 14.31   | 0.62
15           | 16.38       | 17.30       | 11.04   | 8.24        | [illegible] | 12.13   | 0.42
16           | 15.61       | [illegible] | 17.36   | 11.93       | 10.41       | 17.08   | 0.34
17           | 14.22       | 13.13       | 12.71   | 17.44       | 12.94       | 13.56   | 0.58
[illegible]  | [illegible] | 12.77       | 12.02   | 16.52       | 12.78       | 14.15   | 0.67
20           | [illegible] | 13.34       | 13.08   | [illegible] | 13.79       | 14.99   | 0.76

Table 2. Mean Error Rate (%) for 14-20 Classes.

Figure 34. Mean Error Rate (%) Plot for 14-20 Classes.


Table 3 and Figure 35 present the median error rates in percentages, expressed as a function of the number of classes. Median error rates were also considered in order to evaluate classification performance while minimizing the contribution of significant outliers, which would bias the mean error rates. Results show that the smallest PCA-based classification error rate is obtained with PCA50W3 for 14 and 15 classes, PCA50W1 for 16 classes, and PCA50 for 17 to 20 classes. The LDA-based median error rate does not exceed 0.10% (a classification rate of at least 99.9%) for the range of classes considered here. Next, we examined the PCA-based and LDA schemes by increasing the number of classes in increments of 5 classes up to all 50 classes, to investigate the impact the number of classes has on overall classification performance.


# of Classes | PCA50       | PCA50W1     | PCA50W2     | PCA50W3     | PCA50W4     | PCA50W5     | LDA
14           | [illegible] | [illegible] | 12.15       | 5.21        | 7.20        | 12.50       | 0.10
15           | [illegible] | 8.38        | 7.21        | 3.33        | 5.88        | 8.20        | 0.08
16           | 9.25        | 6.83        | 8.58        | 10.35       | 7.00        | 8.00        | 0.04
17           | [illegible] | 5.54        | 6.50        | 11.54       | 6.71        | 11.88       | 0.00
18           | 4.63        | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | [illegible]
19           | 5.58        | 5.75        | 7.83        | 15.08       | 7.13        | 12.42       | 0.04
20           | 5.23        | 6.71        | 10.19       | 14.98       | 10.69       | 12.42       | 0.06

Table 3. Median Error Rate (%) for 14-20 Classes.

Figure 35. Median Error Rate (%) Plot for 14-20 Classes.


Table 4 lists and Figure 36 plots the mean error rates in percentages, expressed as a function of the number of classes for the various PCA-based implementations and the LDA scheme. Results show that PCA50W3 has the smallest error rate for the 15-class case. Results also show that the best PCA-based performance obtained when the class number exceeds 20 is PCA50. The classification performance again degrades when the top eigenvectors are removed. Results also show that the error rates obtained for the LDA and all PCA-based schemes increase as the number of classes increases. This trend is to be expected, as an increase in the number of classes increases the number of images in the dataset, and with it the likelihood of misclassification.


[Table values not recoverable; rows: 15, 20, 25, 30, 35, 40, 45, and 50 classes.]

Table 4. Mean Error Rate (%) as a Function of the Number of Classes (15-50 Classes in Increment of 5 Classes).

Figure 36. Mean Error Rate (%) as a Function of the Number of Classes (15-50 Classes in Increment of 5 Classes).


Table 5 summarizes and Figure 37 plots the median classification error rates, expressed as a function of the number of classes considered in the recognition. Results show that the best PCA-based scheme is PCA50W3 for the 15-class case. Results again show that removing the first few top eigenvectors degrades performance, and that the smallest error rate is obtained consistently with PCA50 when the number of classes is 20 or above. Finally, results also show the consistently better classification performance obtained with the LDA implementation.


# of Classes | PCA50 | PCA50W1 | PCA50W2 | PCA50W3 | PCA50W4 | PCA50W5 | LDA
20           | 5.23  | 6.71    | 10.19   | 14.98   | 10.69   | 12.42   | 0.06
[Remaining rows not recoverable.]

Table 5. Median Error Rate (%) as a Function of the Number of Classes (15-50 Classes in Increment of 5 Classes).

Figure 37. Median Error Rate (%) as a Function of the Number of Classes (15-50 Classes in Increment of 5 Classes).


D. REMOVING CLASS WITH THE HIGHEST ERROR RATE 

Simulations showed that Class #13 had a significantly higher error rate than the other
classes in the database; specific details regarding the performance of this class are
included in Appendix B. We could find no specific reason for this class performing
significantly worse than the other classes available for the study. Thus, we
investigated the impact this “outlier” class had on overall results by removing it from
the database and re-evaluating error rates.


Table 6 lists and Figure 38 plots mean error rates in percent, expressed as a function
of the number of classes, with Class #13 removed from consideration. Results in
Figure 38 indicate a small improvement in the average error rate when Class #13 is
removed. Results also show that removing the first top eigenvectors degrades
classification performance when the number of classes reaches 25 or higher. In addition,
LDA performance is again significantly better than that obtained with any of the
PCA-based schemes.
15 | 11.50 | 8.74 | [illegible] | [illegible] | [illegible] | 17.36 | 0.33
20 | [illegible] | 10.47 | 17.05 | 15.49 | 11.72 | 14.12 | 0.79
25 | [illegible] | 12.61 | 14.23 | 18.41 | 14.70 | 17.66 | 1.57

Table 6. Mean Error Rate (%) as a Function of the Number of Classes (15-25 Classes);
Outlier Class #13 Removed from the Database.
Figure 38. Mean Error Rate (%) as a Function of the Number of Classes (15-25
Classes); Outlier Class #13 Removed from the Database.


Table 7 lists and Figure 39 plots the median error rates for the same experiment.
Again, PCA50W1 is the best-performing PCA-based implementation for the 15-class case.
Thereafter, only PCA50 is suitable when the number of classes is 20 or greater. LDA has
median error rates below 0.4%.


15 | [illegible] | 6.50 | 8.10 | 10.80 | 6.65 | 10.51 | 0.04
20 | 4.20 | 7.77 | 8.56 | 11.67 | 7.98 | 12.79 | 0.02
25 | [illegible] | 8.92 | [illegible] | [illegible] | [illegible] | [illegible] | 0.33
Table 7. Median Error Rate (%) as a Function of the Number of Classes (15-25 Classes);
Outlier Class #13 Removed from the Database.
Figure 39. Median Error Rate (%) as a Function of the Number of Classes (15-25
Classes); Outlier Class #13 Removed from the Database.


E. OVERALL PCA-BASED CLASSIFICATION RESULTS 

For reference, Figure 40 plots the average error rates obtained with a direct
implementation (no PCA or LDA). For the direct implementation, we computed the
class-specific centroids using the original training set without dimensionality
reduction. Next, the minimum distance classifier compared each testing image to all the
class-specific centroids. In practice, we want to avoid the direct implementation since
it has the highest computational cost and does not necessarily produce the most accurate
results, which motivates developing a better classification scheme.
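
The direct implementation can be illustrated with a short MATLAB sketch. This is a
minimal illustration and not the code used in the study: the function name
min_dist_classify and its argument names are hypothetical, images are assumed to be
stored one per column, and the Euclidean distance is assumed.

```matlab
function predicted = min_dist_classify(Xtrain, Ttrain, Xtest)
% Hypothetical sketch of a minimum distance (nearest centroid) classifier.
% Xtrain:    d-by-n matrix, one training image per column
% Ttrain:    1-by-n vector of class labels for the training images
% Xtest:     d-by-m matrix, one testing image per column
% predicted: 1-by-m vector of predicted class labels
Xtrain = double(Xtrain);
Xtest = double(Xtest);
classes = unique(Ttrain);
nClass = length(classes);
centroids = zeros(size(Xtrain,1), nClass);
for c = 1:nClass
    % Class-specific centroid: mean of the training images of that class
    centroids(:,c) = mean(Xtrain(:, Ttrain == classes(c)), 2);
end
predicted = zeros(1, size(Xtest,2));
for k = 1:size(Xtest,2)
    % Euclidean distance from the testing image to every centroid
    diffs = centroids - repmat(Xtest(:,k), 1, nClass);
    dists = sqrt(sum(diffs.^2, 1));
    [dmin, idx] = min(dists);
    predicted(k) = classes(idx);
end
```

Given the true labels Ttest of the testing images, the error rate in percent would then
be 100*mean(predicted ~= Ttest). Because no dimensionality reduction is applied, every
distance is computed over all pixels of the image, which is where the computational
cost of the direct approach comes from.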


Results show that PCA-based error rates appear to stabilize when around the top 50
eigenvectors are selected (Figure 33). Therefore, we investigated the specific PCA-based
classification performances obtained by selecting all, the top 80, and the top 50
eigenvectors. The 900-iteration cross validation scheme was applied to all PCA-based
implementations. We used 900 repetitions to ensure that most of the images were used for
training as different testing/training combinations were selected over the entire k-fold
cross validation procedure. Recall that all PCA-based classifications resulted from a
maximum dimensionality reduction to 900, which corresponds to the 900 training images.
We examined different PCA-based classifications, including removing the first few top
eigenvectors for completeness, and applied k-fold cross validation. Results are shown in
Figures 41-52:


PCAA: using all eigenvectors (Figure 41);
PCAW1: using all the eigenvectors after removing the top eigenvector (Figure 42);
PCAW2: using all the eigenvectors after removing the top two eigenvectors (Figure 43);
PCAW3: using all the eigenvectors after removing the top three eigenvectors (Figure 44);
PCA80: using the top 80 (i.e., the eigenvectors associated with the 80 largest
eigenvalues) (Figure 45);
PCA80W1: using the top 80, after removing the top eigenvector (Figure 46);
PCA80W2: using the top 80, after removing the top two eigenvectors (Figure 47);
PCA80W3: using the top 80, after removing the top three eigenvectors (Figure 48);
PCA50: using the top 50 (Figure 49);
PCA50W1: using the top 50, after removing the top eigenvector (Figure 50);
PCA50W2: using the top 50, after removing the top two eigenvectors (Figure 51);
PCA50W3: using the top 50, after removing the top three eigenvectors (Figure 52).
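
The variants listed above differ only in which columns of the sorted eigenvector matrix
are retained. The sketch below shows one possible way to express that selection; it is
an illustration, not the study's code. The helper name select_eigvecs is hypothetical,
and it assumes that a variant such as PCA50W1 keeps eigenvectors 2 through 50 (the text
does not state whether the removed vectors are replaced by the next ones in rank).

```matlab
function Wsel = select_eigvecs(W, nTop, nRemove)
% Hypothetical helper for selecting a subset of eigenvectors.
% W:       d-by-p matrix of eigenvectors in columns, sorted by descending eigenvalue
% nTop:    number of leading eigenvectors considered (size(W,2) for "all")
% nRemove: number of leading eigenvectors discarded (0 to 3 in the list above)
% Wsel:    d-by-(nTop-nRemove) matrix of retained eigenvectors
nTop = min(nTop, size(W,2));          % guard against asking for more than exist
Wsel = W(:, nRemove+1:nTop);          % assumed convention: drop first, keep up to nTop
```

Under these assumptions, PCAA corresponds to select_eigvecs(W, size(W,2), 0), PCA80W2 to
select_eigvecs(W, 80, 2), and PCA50W3 to select_eigvecs(W, 50, 3). A mean-removed image
matrix X would then be projected with Wsel'*X before the minimum distance classifier is
applied.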
Figure 40.
Figure 41.
Figure 42.
Figure 43.
Figure 44.
[Histogram plot; axis labels: % of Simulations, Error Rate (%)]
Figure 45.
Figure 46.
Figure 47.
Figure 48.
Figure 49.
Figure 50.
Figure 51.
Figure 52. PCA50W3 Classification Error Rate (%) Histogram.


Figures 40, 41, 45, and 49 show similar results, with mean error rates between 20% and
23%. There is no significant degradation in system performance when we compare the
direct classification with the performances obtained with PCAA (using all eigenvectors)
and PCA80 (using the top 80 eigenvectors). We observe a slight performance degradation
for the PCA50 implementation.


No significant improvement is obtained by removing any of the top three eigenvectors, as
shown in Figures 42-44. The same behavior occurs in Figures 46-48 and 50-52. As
mentioned in Sections C and D of this chapter, the PCA-based algorithms must retain at
least the top 50 eigenvectors in order to perform classification. Results suggest that
the PCA-based algorithms have limitations and no longer produce accurate classifications
with a larger database. Results show that the mean error rate of the best PCA scheme
(PCAA) is 20.94%, followed by PCA80 (21.36%) and PCA50 (22.57%), which is far from what
we desire for a good classification system.


F. OVERALL LDA-BASED CLASSIFICATION RESULTS 

Direct classification was less than one percentage point lower in mean error rate than
the best PCA-based scheme; however, direct classification requires a much higher
computational load than the PCA-based and LDA schemes. This result illustrates the fact
that PCA is best suited to dimensionality reduction, but is not specifically designed
for classification applications.


Figure 53 shows classification performances obtained with the LDA-based Fisherface
implementation. Results show that the mean error rate is 5.42%, which is 15.52
percentage points lower than the best PCA-based classification (PCAA). It is also far
better than the direct classification (14.74 percentage points lower in mean error
rate). Recall that our training set contains 900 images, so a 5.42% error rate
corresponds to about 49 erroneous classifications out of 900 images, clearly
outperforming PCA, which yields about 189 erroneous classifications.
Figure 54 plots the distances between a single image of Class #50 and all 50 training
set centroids, and can be viewed as a measure of the robustness of the LDA-based
classification implementation. The bars indicate the minimum, mean, and maximum
distances from subject #50 to each class-specific centroid. The vertical axis represents
the distance between Class #50 and the class-specific centroids. As shown in Figure 54,
the distance is smallest for Class #50, which means the images from that class are very
close to each other, while being quite well separated from all other classes. Similar
outcomes were observed for all other classes.
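
The minimum, mean, and maximum distances described above can be computed along the
following lines. This is a sketch under stated assumptions rather than the variation.m
routine listed in Appendix A: the function name centroid_distances is hypothetical, the
feature vectors are assumed to be already projected into the LDA space and stored one
per column, and the Euclidean distance is assumed.

```matlab
function [dmin, dmean, dmax] = centroid_distances(Y, T, targetClass)
% Hypothetical sketch: distance statistics from the images of one class
% to every class-specific centroid.
% Y:           p-by-n matrix of projected feature vectors, one per column
% T:           1-by-n vector of class labels
% targetClass: label of the class under examination (e.g., 50)
% dmin, dmean, dmax: 1-by-nClass vectors, one entry per class centroid
classes = unique(T);
nClass = length(classes);
Yc = Y(:, T == targetClass);          % feature vectors of the target class
dmin = zeros(1, nClass);
dmean = zeros(1, nClass);
dmax = zeros(1, nClass);
for c = 1:nClass
    centroid = mean(Y(:, T == classes(c)), 2);
    diffs = Yc - repmat(centroid, 1, size(Yc,2));
    d = sqrt(sum(diffs.^2, 1));       % distance of each target image to centroid c
    dmin(c) = min(d);
    dmean(c) = mean(d);
    dmax(c) = max(d);
end
```

A plot in the spirit of Figure 54 could then be drawn with
errorbar(1:length(dmean), dmean, dmean-dmin, dmax-dmean). A well-separated class shows
small values at its own centroid and clearly larger values at all others.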
Figure 54. LDA-Based Classification.
Table 8 summarizes the classification performances. The mean error rate obtained for
each classification scheme is shown at the bottom of the table for comparison. Results
show that PCAA (PCA-based classification using all eigenvectors) has the best
performance among all PCA-based schemes investigated, followed by PCA80 (using the top
80 eigenvectors) and PCA50 (using the top 50 eigenvectors). Results also show a
degradation in performance when removing the first few top eigenvectors for PCA-based
classifications. LDA has the best performance and the lowest mean error rate of 5.42%.
Class # | Direct | PCAA | PCAW1 | PCAW2 | PCAW3 | PCA80 | PCA80W1 | PCA80W2 | PCA80W3 | PCA50 | PCA50W1 | PCA50W2 | PCA50W3 | LDA
1 45.97 46.34 33.31 24.61 31.33] 46.70 33.54 25.51 32.40 47.72 34.23 28.69 34.51] 9.52 
2 38.18] 36.69] 28.00] 33.83] 38.65 37.14 34.52] 39.73 38.38 36.43 6.64 
3] 15.44 16.11 16.08 32.00 26.82| 16.78 16.47 33.00 27.88 18.09 17.67 35.40 30.34] 2.50 
JE 15.52| 2.51 0.03) 0.05] 16.32 aa 0.06] 18.55 pee 0.20] _0.00 
5 0.03 0.03 15.31 25.45 24.41 0.03 15.38 25.74 24.73 0.03 16.58 26.91 26.55| 0.03 
6 17.70 4.75 6.84 7.02} 18.23 5.08 7.15 7.40 19.31 6.06 7.33 7.71] 0.22 
8] 11.97 12.21 24.24 35.30 29.40] 12.38 24.71 35.65 29.93 12.77 25.72 36.33 31.18] 4.18 
9 0.45[ _0.04[ 0.00 0.14] 0.44 0.04 0.121 0.50 0.06 0.00 0.18) 0.52 

TH 14.31] 22.44) 21.07) 21.37) 14.73) 23.32 23.54| 0.78 
12 3.57|___3.53|_ 2.33) 5.61] 3.74 3.67 6.58| 0.19 
13 77.43| _85.12| 64.33] 63.72| 78.09] 85.95 68.56] 2.80 
14 4.88[ 3.83] 2.72] 3.68] 4.96 3.88 3.89] 0.06 
15] 0.79] 0.94] 10.92] 11.11 10.75] 1.09] 11.18 11.73] 0.01 
16 4.07|_ 2.07) 5.38] 6.36| 4.28 2.17[ 5.60) 6.85 4.73) 2.56 6.60] 8.57| 0.86 
50 7.83 12.03] 17.44) 23.82) 8.23) 12.49 26.53] 4.01 
51 2.05 24% 2.14 4.81 4.83 2.33 2.44 5.03 4.76 3.19 3.13 6.25 6.09] 0.50 
52 456] 9.46] 7.38] 13.07| 4.69 9.38 13.23[ 5.31 10.89 14.40| 1.63 
53} 31.59 31.75 38.64 39.83 37.28] 31.84 38.70 40.07 37.07 32.51 38.90 40.54 37.27| 14.33 
=| 1.39| 8.46] 10.46[ 10.78[ 1.50 +43} aoe 11.21 1.89 542 12.51| 0.31 
55} 14.11 14.88 42.83 36.90 32.85} 15.73 43.82 38.07 33.70 17.43 46.58 40.69 35.60] 1.41 
=| 14.83] 11.58| _7.47[__17.25| 15.63 oe] a 17.87|__17.83 a 20.42] 4.14 
57} 14.10 14.71 24.39 21.44 24.23] 15.78 25.69 22.67 25.80 18.64 29.90 26.14 29.32| 4.38 
58 35.22] 38.47] 43.02] 46.29] 35.52 38.77 46.63] 35.77 40.32 47.08] 12.76 
59 14.02] 16.41) 18.90) 21.05) 14.22) 16.57] 19.18] 21.29] 15.18] 17.19] 19.46] 22.32 6.58 
60 51.91] 13.33 
61 6.68} 10.98) 9.39) 12.99] 6.87] 10.80 13.85] 1.51 
62 10.93] 15.69] 19.15, 21.23] 11.39 16.05| __19.60| 21.76] 11.98] 17.19] 20.78 22.62] 10.94 
63 17.20] 36.94) 41.80) 61.52) 17.42) 37.31 64.25] 5.99 
64 32.18) 41.24] 40.76] 40.60] 32.52] 42.24 25.50| 11.97 
65 26.15| 17.06] 19.73] 25.73] 26.64] 17.48 28.16] 4.31 
66 64.57] 6.83 
67 13.77| 19.09] 22.55] 27.75) 14.77) 20.37 30.94] 1.48 
68] 50.09 51.05 58.94 40.10 41.86] 51.97 59.93 40.86 43.07 54.02 62.68 41.97 43.69| 17.63 
69 30.78] 32.68] 39.24] 36.71| 31.98 33.96 38.33] 34.50 36.95 42.21| 5.57 
70} 24.57 25.59 31.02 28.70 16.32] 26.04 31.66 29.81 16.94 28.39 34.77 34.30 20.67| 3.22 
71 20.38] 26.43] _23.81| _29.37| 20.37 26.48 29.46] 20.18 26.19 29.56 2.57 
72) 21.69 22.25 32.26 35.62 32.01 22.53 32.13 36.44 32.69 23.49 32.94 38.13 34.17] 9.42 
73 55.83] 48.24] _44.37| _47.26| 56.68 49.82 49.41] 57.80 52.57 53.83] 7.50 
74} 30.99 32.58 15.19 21.13 24.00] 33.45 15.88 22.29 25.24 35.35 18.48 26.40 29.37| 2.76 
a 7.10 15.82] 23.31] 26.75| 7.51 16.48 27.44| 8.77 nel 30.85] 3.49 
76 31.85] 44.64. 39.64] _53.17| 33.19 45.94 55.21| 36.70 48.16 58.35] 9.26 
77 22.23) 37.71] 40.35] 54.44] 22.56 38.54 55.26] 24.16 40.29 57.66] 8.38 
78 11.72] 20.03 21.66[ 21.19] 12.32] 10.04 
79 28.62| 36.23) _28.57| _30.56| 29.35 36.79 31.42] 30.36 38.44 33.56] 4.95 
80 26.12| 27.85] 29.32| 31.95] 26.30| 27.84 32.25| 3.76 
81 36.67] 34.10] 35.21] 46.28] 37.14 34.38 48.23] 16.41 
82 11.29] 14.54) 12.67) 12.01) 11.42) 14.72 13.65| 1.14 
83] 28.63 29.96 39.61 34.90 40.94] 30.94 40.34 35.45 41.73 32.57 41.42 37.01 43.32| 17.52 
84 14.14[ _8.29| 1247| 20.45| 14.77 9.10 21.19] 15.83 9.91 22.47| 9.78 
85} 15.16 16.71 23.74 35.72 26.52| 17.33 24.53 36.30 27.04 18.97 26.71 39.02 29.28] 2.94 
mean 20.78| 24.54] 24.99] _27.56| 21.26 25.06 25.61 28.19| 22.47 26.61 29.97| 5.42 
median 16.11 16.96 21.45 24.21 26.63| 17.38 22.56 24.89 27.24 18.84 24.87 27.07 29.46] 4.07 

Table 8. Error Rate (%) Per Class.





G. UNFOCUSED INFRARED CAMERA LENS EFFECTS 

The infrared camera lens was intentionally unfocused for one image of each set to
investigate the impact of optical malfunctions on the performance of the uncooled
thermal sensor. An additional image was obtained from each subject with a neutral facial
expression facing straight at the camera (Mark #5 in Figure 7). The infrared camera was
manually unfocused to reduce the resolution and blur the images. Figures 55 and 56 show
a sample unfocused image and its corresponding cropped image. The cropped image replaced
the original focused one with the same facial expression and head orientation in the
database.


The k-fold cross validation was performed, and Table 9 summarizes the cross-validation
results. Results indicate that there is less than 1% deviation in mean error rates per
class between the original and unfocused datasets. The results also suggest that the
infrared camera can still perform successfully with reduced resolution.





Figure 55. Unfocused IR Image Sample. 


Figure 56. Unfocused IR Cropped Image. 
Table 9. Unfocused Camera Lens Error Rate in % Per Class.








Class # | Direct | PCAA | PCAW1 | PCAW2 | PCAW3 | PCA80 | PCA80W1 | PCA80W2 | PCA80W3 | PCA50 | PCA50W1 | PCA50W2 | PCA50W3 | LDA
1 28.05 34.30| 9.45 
at 37.76 38.18] 37.09] 28.60] 34.36] 38.65] 37.50] 29.271 34.97] 39.64] 38.82] 31.11 36.82| 7.18 
3[_ 15.42 16.05] 16.09] 31.19f 25.82] 16.64] 16.45] 32.22[ 26.97] 17.97] 17.65 34.69 29.19] 2.54 
4{14.45[ 15.54] 2.57] 0.047 0.04] 16.25] 2.98] 0.06.06] 18.57] 4.33] 0.12 0.15| 0.00 
s[ 0.03 0.03[ 14.98] 25.35[ 23.53] 0.03] 15.13] 25.67] 23.88] 0.03] 16.29] 26.83 25.80] 0.01 
ef t7.teft7.73t_ 471] 6.95[ 6.67] 18.28] 5.09] 7.26] 6.89] 19.39] 6.15 7.50 7.20{ 0.25 
sf 14.99[ 12.19] 24.16] 35.50[ 29.50] 12.38] 24.64] 35.75] 29.97] 12.82] 25.70] 36.67 31.31] 4.32 
gf o4ito.44t0.04f 0.00] 0.12] 0.43f0.04f 0.00.10] 0.48] 0.06 0.00 0.15] 0.58 
11 23.32 23.43| 0.71 
taf 3.43f 3.56] 3.78] 2.24f 5.33] 3.70] 3.94] 2.55f 572] 10414] 3.72 6.43] 0.20 
13| 76.82 : : : 85.72 64.22| 78.87 68.83 67.98| 2.91 
14 472 3.91 3.56| 5.99 3.13 3.73| 0.08 
15 0.77 11.59 1119] 2.02 13.32 12.32| 0.02 
16 2.24 6.81| 4.69 6.47 8.53| 1.20 
50 8.80 12.24 22.58 16.05 24.70| 6.65 
51 4.28 8.74 7.90| 2.18 
52| 270 _3.00| 6.65 6.82 13.04] 2.02 
53[28.48[ 28.77 36.87| 38.421 34.93] 28.74] 36.91] 38.55] 34.96] 29.60] 37.22] 38.88 35.15] 13.02 
Sat t.tef 1.36] 8.40] 9.97 t0.71T 142] 8.65] 10.14f 14.20] 1.90] 9.86 11.69 12.48] 0.31 
se 10.83[ 14.61] 44.11] 36.93[ 31.63] 12.32] 45.33] 37.98] 32.63] 13.80] 48.69] 40.88 34.98] 1.44 
se 12.68 13.34] 12.55] 8.91[ 20.48] 13.74] 13.13] 9.35] 21.05] 15.79] 15.34] 11.02 22.74| 3.94 
s7[_15.88[ 16.44] 23.57] 19.721 22.62] 17.57] 24.81] 20.93] 24.29] 19.87] 28.51] 24.87 28.51] 3.31 
5st 39.10 39.45] 39.31] 46.46[ 49.06] 39.73] 39.79] 46.87] 49.43] 40.23] 41.07] 47.34 49.93] 11.35 
sof 12.06 12.88] 13.92] 16.53[20.46f 12.90] 14.11] 16.81 20.69] 13.82] 14.70] 16.97 21.65| 6.04 
Gol 39.83 40.531 53.76] 52.90] 50.27] 41.07] 54.41] 53.80] 54.20] 42.52] 56.32] 56.40 52.91] 13.92 
61 12.21 16.77| 2.06 
62| 11.59 11.46] 15.30 i E ; 15.70 5 k ; : 20.19 21.65| 10.84 
63|__17.21|__17.61| _ 37.10 37.43 43.70 64.05| 6.56 
64| 31.09 31.83] 41.11 41.88 45.57 44.30| 11.95 
65|__23.49| _24.55| 12.13 12.48 18.68 23.49| 4.19 
66|__17.37|__18.21| 21.08 22.49 40.81 60.83| 8.10 
67 : ; i ; : : 27.24 33.44| 2.67 
6sf48.49[ 49.56] 57.57] 39.93[ 41.19f 50.21] 58.39] 40.57] 44.94] 52.10] 61.11] 41.60 42.59] 17.98 
e9[ 28.32 29.38] 31.64] 38.82] 36.28] 30.44] 32.92] 39.96] 37.91] 33.12] 35.79] 42.93 41.69] 5.82 
ol 24.19 25.17] 31.23] 28.55[ 16.56] 25.73] 34.79] 29.53] 17.16] 28.18] 35.39] 34.30 20.74| 3.55 
ia 24.58 29.58] 2.45 
7222.01 22.86] 30.78] 34.19[31.44f 22.97] 30.61] 35.14f 32.17] 23.97] 30.94 36.52 32.96] 10.60 
73{_ 57.16 59.56] 47.95| 43.21 50.82 53.09] 7.06 
74 30.26[ 34.71] 14.19] 17.34[ 19.82] 32.46] 14.81] 18.56f 21.00] 34.01] 17.14] 22.13 24.75] 2.73 
ml 6.21[ 6.87] 15.89] 23.74T 27.20] 7.08] 16.57] 24.89] 28.01] 8.24] 19.45] 27.55 31.55] 3.13 
76, 32.04[ 33.48 42.72] 38.21[ 51.01] 34.56] 43.72] 39.05] 53.08] 37.20] 45.64 40.48 55.64| 10.15 

i : 3 38.78 54.69 42.74 57.00| 8.31 

20.06 11.76 21.22 12.07| 10.35 

37.18 31.14 31.36 33.37| 4.84 

27.85 32.03 30.07 32.28| 4.00 

36.68 46.90 37.20 48.16| 15.87 

15.87 15.31| _3.03 

83] __29.48[ _31.09| 40.31 37.76 44.78| 17.27 
sat 12.94 14.40] 7.29] 9.89 19.06f 15.07] 8.07] 10.33] 19.54] 16.07] 8.57] 11.20 21.05] 9.81 
ssf 17.34 18.73] 25.28] 36.97[ 26.97] 19.19] 25.97] 37.56] 27.40] 20.78] 28.11 40.04 29.57] 3.94 

mean | 20.29] 20.94] 24.51] 24.90] 27.25 21.36] 25.03] 25.49 27.86] 22.57] 26.57] 27.20 29.64] 5.62 

median 27.24 29.57| 4.00 



This chapter presented overall PCA-based and LDA-based classification results. Our
results confirmed that the Linear Discriminant Analysis is far superior to the Principal
Component Analysis in classification applications. Results also showed that removing the
top eigenvectors did not benefit the PCA-based algorithms when dealing with a larger
database. The investigation of unfocused camera lens effects further confirmed the high
performance of infrared imagery under uncontrolled environments or non-optimal
conditions.







VI. CONCLUSIONS 


This study investigated face recognition using an uncooled infrared camera with an
expanded database. A database containing 420 facial images obtained from 14 volunteers
was available from a previous study. An additional 1080 images from 36 volunteers were
included in the expanded database, resulting in a total of 1500 images. Each subject was
required to perform three different facial expressions with 10 different head
orientations. Facial expressions considered were neutral sitting, smiling, and
pronouncing the vowel “u”. The distance between the subject and the camera was kept
constant while permitting a vertical and horizontal angle freedom of 10°. In addition,
36 images (one from each subject from Class #50 to #85) were collected with an
intentionally unfocused camera lens for additional scheme analysis.


We developed an automatic image cropping technique to process large numbers of images.
In addition, we implemented an automated data storage and upload mechanism. This study
mainly focused on two linear schemes to examine infrared imaging in face recognition.
The first linear scheme was the Principal Component Analysis (PCA). The second linear
approach was the Fisher Linear Discriminant Analysis (LDA), combined with PCA for
dimension reduction and classification. The minimum distance classifier was selected for
its simplicity and accuracy in classification applications. Different PCA-based and LDA
schemes were compared using a 60/40 k-fold cross validation scheme with 900 iterations.


Results show that uncooled infrared imaging is a viable candidate for face recognition
applications, and that the LDA approach is far superior to the various PCA-based
classification algorithms investigated in this study, leading to a 94.58% average
classification performance. However, PCA and LDA have limitations. PCA cannot exploit
extra compression associated with nonlinear relationships. LDA assumes that class means
convey most class information; therefore, LDA favors unimodal data distributions. In
addition, LDA cannot separate nonlinearly separable data sets and classes with the same
mean [20]. Extensions to nonlinear classification algorithms are currently under study
to investigate whether they lead to better classification performances. Finally, the
results presented here were derived from images collected under very controlled
environmental conditions: fixed distance, indoor environment, and limited facial
expressions. Extensions to the study should include relaxing such constraints and
investigating the resulting impacts on classification performances.
APPENDIX A. MATLAB SOURCE CODES

This appendix contains all MATLAB source codes used for this study:

AutoCrop.m automatically crops IR images and converts the resulting cropped images to
.bmp format;
readpgm8.m reads images obtained from the IR camera in .pgm format;
top.m automatically crops top and bottom portions of IR images;
side.m automatically crops left and right portions of IR images;
load_any_img_to_matrix.m loads cropped images into a single matrix;
pca_dr.m produces a PCA dimensionality reduction example;
pca_ex.m produces a PCA classification example;
lda_ex.m produces an LDA classification example;
pca_vs_lda.m produces an example to compare PCA and LDA implementations;
pca.m performs PCA implementation;
sortem.m sorts eigenvalues and corresponding eigenvectors in descending order;
fid.m performs LDA implementation;
pca_eig.m performs PCA implementation and plots error rates as a function of the number
of eigenvectors;
variation.m performs LDA implementation and plots maximum, mean, and minimum distance
from class-specific images to class-specific centroids;
kcv.m performs k-fold cross validation.
AutoCrop

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%
% Filename: AutoCrop.m
% Thesis Advisor: Prof. M.P. Fargues, Naval Postgraduate School, Monterey, CA
% Thesis CoAdvisor: Prof. G. Karunasiri, Naval Postgraduate School, Monterey, CA
% Author: Colin K Lee, Naval Postgraduate School, Monterey, CA, 2004
% Descriptions: Routine program used to read the binary PGM image data
%               obtained from IR-160 camera, to crop image automatically to
%               60x45 pixels, and to convert and rename images file in bmp
%               format.
% Inputs:
%   Person: Class number assigned to each subject
% Outputs: Cropped image data saved in bmp format
% Function(s) called: readpgm8.m, top.m, and side.m.
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
clc
clear


Person=[83]; % Person numbers contained on the files
N_pictures=10; % number of pictures for each person number
Section=1;
for i=1:length(Person)
    for j=1:N_pictures
        im_num1= num2str(Person(i)); im_num2= num2str(j); im_num3=num2str(Section);
        img_name = strcat(im_num1,'-',im_num2,'-',im_num3,'.pgm');
        A=readpgm8(img_name); % read pgm files
        B = top(A);
        C = side(B);
        namefinal=strcat(im_num1,'-',im_num2,'-',im_num3,'-a','.bmp');
        imwrite(C,namefinal,'bmp');
    end
end


Section=2;
for i=1:length(Person)
    for j=1:N_pictures
        im_num1= num2str(Person(i)); im_num2= num2str(j); im_num3=num2str(Section);
        img_name = strcat(im_num1,'-',im_num2,'-',im_num3,'.pgm');
        A=readpgm8(img_name); % read pgm files
        B = top(A);
        C = side(B);
        namefinal=strcat(im_num1,'-',im_num2,'-',im_num3,'-a','.bmp');
        imwrite(C,namefinal,'bmp');
    end
end


Section=5;
for i=1:length(Person)
    for j=1:N_pictures
        im_num1= num2str(Person(i)); im_num2= num2str(j); im_num3=num2str(Section);
        img_name = strcat(im_num1,'-',im_num2,'-',im_num3,'.pgm');
        A=readpgm8(img_name); % read pgm files
        B = top(A);
        C = side(B);
        namefinal=strcat(im_num1,'-',im_num2,'-',im_num3,'-a','.bmp');
        imwrite(C,namefinal,'bmp');
    end
end


Section=6;
for i=1:length(Person)
    for j=1:N_pictures
        im_num1= num2str(Person(i)); im_num2= num2str(j); im_num3=num2str(Section);
        img_name = strcat(im_num1,'-',im_num2,'-',im_num3,'.pgm');
        A=readpgm8(img_name); % read pgm files
        B = top(A);
        C = side(B);
        namefinal=strcat(im_num1,'-',im_num2,'-',im_num3,'-a','.bmp');
        imwrite(C,namefinal,'bmp');
    end
end
readpgm8
% function image = readpgm8(filename)
%
% Filename: readpgm8.m
% Thesis Advisor: Prof. M.P. Fargues, Naval Postgraduate School, Monterey, CA
% Thesis CoAdvisor: Prof. G. Karunasiri, Naval Postgraduate School, Monterey, CA
% Author: Matthew Dailey, 1997
% Modified by: Colin K Lee, Naval Postgraduate School, Monterey, CA, 2004
% Descriptions: reads the binary PGM image data obtained from IR-160 camera
% Inputs:
%   filename: binary PGM image data
% Outputs:
%   image: 2-dimensional array of integers
% Function(s) called: none
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function image = readpgm8(filename)

% Open the file
fid = fopen(filename,'r');
% Parse and check the header information. No # comments allowed.
A = fgets(fid);
if strcmp(A(1:2),'P5') ~= 1
    error('File is not a raw PGM');
end;
A = fgets(fid);
if strcmp(A(1:2),'#') ~= 0
    error('File is not a raw PGM');
end;
A = fgets(fid);
sizes = sscanf(A,'%d');
w = sizes(1);
h = sizes(2);
A = fgets(fid);
max = sscanf(A,'%d');
tlength = w*h;
if max ~= 255
    error('Cannot handle anything but 8-bit graymaps');
end;
% Read the raw data
[v,count] = fread(fid,inf,'uint8=>uint8'); % this makes v a uint8 instead of double
%if count ~= tlength
%    error('File size does not agree with specified dimensions.');
%end;
v=v(1:tlength,1);
% Pack the column vector v into the image matrix
image = reshape(v,w,h)';
fclose(fid);
top
% function [CC] = top(B)
%
% Filename: top.m
% Thesis Advisor: Prof. M.P. Fargues, Naval Postgraduate School, Monterey, CA
% Thesis CoAdvisor: Prof. G. Karunasiri, Naval Postgraduate School, Monterey, CA
% Author: Colin K Lee, Naval Postgraduate School, Monterey, CA, 2004
% Descriptions: crops top and bottom portions of image data, results in image
%               containing facial characteristics above chin and below eyebrows
% Inputs:
%   B: image data matrix
% Outputs:
%   CC: cropped image data matrix
% Function(s) called: none
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [CC] = top(B)
rotate_B = rot90(B);
[i,j] = find(rotate_B);
z = [i j];
s = z(1,2);
[rowNum colNum]=size(rotate_B);
for j=s:colNum;
    for i=1:rowNum;
        C(i,j-s+1)=rotate_B(i,j);
    end
end
midP=round((colNum-s)/2);
CC=C(:,midP-18:midP+41);
CC=rot90(CC,3);
return
side
% function [DD] = side(B)
%
% Filename: side.m
% Thesis Advisor: Prof. M.P. Fargues, Naval Postgraduate School, Monterey, CA
% Thesis CoAdvisor: Prof. G. Karunasiri, Naval Postgraduate School, Monterey, CA
% Author: Colin K Lee, Naval Postgraduate School, Monterey, CA, 2004
% Descriptions: crops side portions of image data
% Inputs:
%   B: image data matrix
% Outputs:
%   DD: cropped image data matrix in 60x45 pixels
% Function(s) called: none
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [DD] = side(B)
[i,j] = find(B);
z = [i j];
s = z(1,2);
[rowNum colNum]=size(B);
for j=s:colNum;
    for i=1:rowNum;
        C(i,j-s+1)=B(i,j);
    end
end
CC = fliplr(C);
[m,n] = find(CC);
zf = [m n];
sf = zf(1,2);
[rowNumf colNumf]=size(CC);
for n=sf:colNumf;
    for m=1:rowNumf;
        DD(m,n-sf+1)=CC(m,n);
    end
end
D = fliplr(DD);
[h k] = size(D);
midP=round(k/2);
DD=D(:,midP-26:midP+18);
return
load_any_img_to_matrix
% function [A,T] = load_any_img_to_matrix(C,O,S)
%
% Filename: load_any_img_to_matrix.m
% Thesis Advisor: Prof. M.P. Fargues, Naval Postgraduate School, Monterey, CA
% Thesis CoAdvisor: Prof. G. Karunasiri, Naval Postgraduate School, Monterey, CA
% Author: Colin K Lee, Naval Postgraduate School, Monterey, CA, 2004
% Descriptions: load any image into a single matrix
% Inputs:
%   C: Class #
%   O: number of images
%   S: Section # of specific facial expression
% Outputs:
%   A: a single data matrix
%   T: Class #
% Function(s) called: none
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [A,T] = load_any_img_to_matrix(C,O,S)
Person = C;
N_objects = length(O);
N_class=length(Person);
Section = S;
A=[];
T=[];
for i=1:N_class
    for j=1:N_objects
        for k=1:length(Section)
            im_num1= num2str(Person(i));
            im_num2= num2str(j);
            s=num2str(Section(k));
            img_name = strcat(im_num1,'-',im_num2,'-',s,'-','a');
            img= imread(img_name,'bmp');
            [dim1,dim2] = size(img);
            x=reshape(img,dim1*dim2,1);
            A=[A x]; % A contains one image in each column
            T=[T Person(i)]; % T contains the class of each image
        end
    end
end
%
% Filename: pca_dr.m
% Thesis Advisor: Prof. M.P. Fargues, Naval Postgraduate School, Monterey, CA
% Thesis CoAdvisor: Prof. G. Karunasiri, Naval Postgraduate School, Monterey, CA
% Author: M. Cairns
% Modified by: Colin K Lee, Naval Postgraduate School, Monterey, CA, 2004
% Descriptions: Routine program to demonstrate the use of PCA in dimensionality
%               reduction.
% Inputs: none
% Outputs: none
% Function(s) called: none
%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
clc
clear all;
close all;

% Generate elliptical cloud of data-points.
x(1,:) = randn(1,100);
x(2,:) = randn(1,100)*3;
% Rotate the cloud for demonstration.
[p(1,:),p(2,:)] = cart2pol(x(1,:),x(2,:));
p(1,:) = p(1,:)-pi/3;
[x(1,:),x(2,:)] = pol2cart(p(1,:),p(2,:));
% Plot data.
figure(1)
scatter(x(1,:),x(2,:));
axis equal;
% Calculate PC's.
[pc, latent, explained] = pcacov(cov(x'));
% Draw PC's on top of data.
figure(2)
scatter(x(1,:),x(2,:));
axis equal;
hold on
plot([-4 4]*pc(1,1),[-4 4]*pc(2,1),'r-');
hold on
plot([-2 2]*pc(1,2),[-2 2]*pc(2,2),'g-');
hold off;
% Rotate the data to the PC's
y = (x'*pc)';
% Plot data.
figure(3)
scatter(y(1,:),y(2,:));
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axis equal; 


% Calculate PC's, to demonstrate they now lie on the axes. 
[pce2, latent, explained] = pcacov(cov(y')); 
% Draw PC's on top of data. 
figure(4) 
scatter(y(1,:),y(2,:)); 
axis equal; 
hold on; 
plot([-4 4]*pc2(1,1),[-4 4]*pc2(2, 1),'r-'); 
hold on 
plot([-2 2]*pc2(1,2),[-2 2]*pc2(2,2),'g-'); 
hold off 
%Set the second component of y to zero, reducing the dimensionality to one. 
y(2,:) = 0; 
% Transform back to the original data. 
x = (y'*inv(pc))'; 
% Plot data. 
figure(5); 
scatter(x(1,:),x(2,:)); 
axis equal; 
% 

% Filename: pca_ex.m 

% Thesis Advisor: Prof. M.P. Fargues, Naval Postgraduate School, Monterey, CA 
% Thesis CoAdvisor: Prof. G. Karunasiri, Naval Postgraduate School, Monterey, CA 


% Author: Diogo Pereira, Naval Postgraduate School, Monterey, CA, 2002 
% Modified by: Colin K Lee, Naval Postgraduate school, Monterey, CA, 2004 
% Descriptions: Routine program to compare PCA in classification 

% Inputs: none 

% Outputs: none 

% Function(s) called: none 

% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
close all 

clear all 

clc 

rand('seed',0); 


a1=rand(1,50); 
a2=rand(1,50); 
b1=rand(1,50); 
b2=rand(1,50); 
c1=[a1;0.5*a2]; 
c2=[1.5+b1;0.5*b2]; 
figure(1) 
plot(c1(1,:),c1(2,:),'b*') 
hold on 
plot(c2(1,:),c2(2,:),'ro') 
axis([0 2 -1 1]) 

A=[c1 c2]; 
[W,m,Amean,EVA]=pca(A,1); 
W; 

x=linspace(-2,2); 
y1=x*W(1,:)+Amean(1); 
y2=x*W(2,:)+Amean(2); 
plot(y1,y2,'g-') 

grid 

% Example where PCA does not work 
c1=[4*a1;0.5+0.2*a2]; 
c2=[4*b1;0.2*b2]; 
% 

% Filename: lda_ex.m 

% Thesis Advisor: Prof. M.P. Fargues, Naval Postgraduate School, Monterey, CA 
% Thesis CoAdvisor: Prof. G. Karunasiri, Naval Postgraduate School, Monterey, CA 
% Author: MathWorks 


% Modified by: Colin K Lee, Naval Postgraduate School, Monterey, CA, 2004 

% Descriptions: Routine program to demonstrate the use of LDA in classification 
% Inputs: none 

% Outputs: none 

% Function(s) called: none 

% 

IRI ea cP OD ICT eh ee 
cle 

clear 

close all 


c1=[randn(100,1) randn(100,1)+1 randn(100,1)+6]; 
c2=[randn(100,1) randn(100,1) randn(100,1)]; 
figure(1) 

scatter3(c1(:,1),c1(:,2),c1(:,3),'filled') 

hold on; 

scatter3(c2(:,1),c2(:,2),c2(:,3),'r','filled') 


figure(2) 
scatter(c1(:,1),c1(:,2),'filled'); 
hold on 
scatter(c2(:,1),c2(:,2),'r','filled'); 
axis equal; 

data=[c1;c2]; 
label=[ones(100,1);zeros(100,1)]; 
mode=1; 

if(any(label==0)), 

label=label+1; 

end 

cat=length(unique(label)); 
[n,f]=size(data); 

Sw=zeros(f); 

Sb=zeros(f); 

m=mean(data); 

for i=1:cat, 
[r,c]=find(label==i); % rows of data belonging to class i 
mg=mean(data(r,:)); 
ng=length(r); 
Sw=Sw + cov(data(r,:)).*(ng-1); 
Sb=Sb + ng*(mg-m)'*(mg-m); 
end 
[v2,d2]=eig(inv(Sw)*Sb); 
drec=2; 
A=v2(:,1:drec); 
dataLDA=data*A; 
data_final=reshape(dataLDA,100,4); 
figure(3) 
scatter(data_final(:,1),data_final(:,2),'filled'); 
hold on; 
scatter(data_final(:,3),data_final(:,4),'r','filled'); 
hold on; 
plot([-10 4]*v2(1,1),[-10 4]*v2(2,1),'g-'); 
hold on; 
axis equal; 
drec=3; 
A1=v2(:,1:drec); 
dataLDA1=data*A1; 
data_final1=reshape(dataLDA1,100,6); 
% Rotate the data to the v's 
y=(data*v2')'; 
% Plot data. 
pca_vs_lda 


% 

% Filename: pca_vs_lda.m 

% Thesis Advisor: Prof. M.P. Fargues, Naval Postgraduate School, Monterey, CA 
% Thesis CoAdvisor: Prof. G. Karunasiri, Naval Postgraduate School, Monterey, CA 


% Author: Diogo Pereira, Naval Postgraduate School, Monterey, CA, 2002 
% Modified by: Colin K Lee, Naval Postgraduate school, Monterey, CA, 2004 
% Descriptions: Routine program to compare PCA and LDA in classification 

% Inputs: none 

% Outputs: none 

% Function(s) called: pca.m, fld.m 

% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
close all 

clear all 

clc 

% PCA does not work 


c1=[4*a1;0.5+0.2*a2]; 
c2=[4*b1;0.2*b2]; 
figure(2) 
plot(c1(1,:),c1(2,:),'b*') 
hold on 
plot(c2(1,:),c2(2,:),'ro') 
axis([0 4 -.2 .8]) 

A=[c1 c2]; 
[W,m,Amean,EVA]=pca(A,1); 
W; 

x=linspace(-2,2); 
y1=x*W(1,:)+Amean(1); 
y2=x*W(2,:)+Amean(2); 
plot(y1,y2,'g-') 

grid 

% LDA works 
c1=[4*a1;0.5+0.2*a2]; 
c2=[4*b1;0.2*b2]; 
figure(3) 
plot(c1(1,:),c1(2,:),'b*') 
hold on 
plot(c2(1,:),c2(2,:),'ro') 
axis([0 4 -.2 .8]) 

A=[c1 c2]; 
C=[ones(1,50) ones(1,50)*2]; 
[W,D]=fld(A,C); 

W; 
x=linspace(-2,2); 
y1=x*W(1,:)+Amean(1); 
y2=x*W(2,:)+Amean(2); 
plot(y1,y2,'g-') 
grid 
pca 
% function [W,m,Amean,Ad,EVA]=pca(A,n) 


% 

% Filename: pca.m 

% Thesis Advisor: Prof. M.P. Fargues, Naval Postgraduate School, Monterey, CA 
% Thesis CoAdvisor: Prof. G. Karunasiri, Naval Postgraduate School, Monterey, CA 


% Author: Diogo Pereira, Naval Postgraduate School, Monterey, CA, 2002 
% Modified by: Colin K Lee, Naval Postgraduate school, Monterey, CA, 2004 

% Descriptions: Function computes the pca(Principal Component Analysis) of the 
% data contained in the array A. 

% Inputs: 

% A: k x n data matrix contains each image in each column 

% n: number of data samples 

% Outputs: 

% W: contain n eigenvectors, one in each column 

% m: fraction of the variance in n eigenvectors 

% Amean: mean of the data contained in A in a column 

% Ad: data less the mean Amean, all the data in columns 

% EVA: contain the eigenvalues corresponding to the eigenvectors 
% of W on the diagonal 

% Function(s) called: sortem.m 

% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
function [W,m,Amean,Ad,EVA]=pca(A,n) 


[Adimk,Adimn]=size(A); 
if (nargin==1) 

n=Adimn; 
end 
if (n>Adimn) 

n=Adimn; 

end 
% Amean is the mean of A, taken across the columns 
Amean=mean(A,2); 
% Ad is the difference between A and Amean 
Ad=A - Amean*ones(1,Adimn); 
%eigenvectors (columns of Vectors) and eigenvalues (diag of Values) 
[EVE,EVA] = eig(Ad'*Ad); 
% Obtain the indices of the eigenvalues greater than 0.001 times the largest eigenvalue 
[I]=find(EVA>(0.001*max(max(EVA)))); 
EVAC=zeros(size(EVA)); 
EVAC(I)=EVA(I); % EVAC (eigenvalues conditioned) keeps only the values above the 
% threshold; all other values are set to zero 


% Obtain n eigenvectors 
% EVAINV is the matrix containing the inverse square root of the eigenvalues on the 
% diagonal; it contains zero where the eigenvalue was below the threshold 
EVAINV=zeros(Adimn); % EVAINV has size n x n since n is the maximum number of 
% eigenvalues 
EVAINV(I)=1./(sqrt(EVA(I))); 
U=A*EVE*EVAINV; 
% Sort the vectors/values according to the absolute value of the eigenvalues in EVAC 
% and eliminate the columns corresponding to zero eigenvalues 
[W,EVA]=sortem(U, EVAC); 
if (n>size(W,2)) 
n=size(W,2); 
end 


W=W(:,1:n); 
temp=diag(EVA); 
m=sum(temp(1:n))/sum(temp); 
return 
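
The snapshot computation used in pca.m can be checked on a small example. The sketch below is illustrative only: the 6 x 4 matrix A is made-up data, and the names lam, keep, U, S, resid, and orth_err are assumptions, not part of the thesis code. It eigen-decomposes the small n x n matrix Ad'*Ad and maps the result to eigenvectors of the large k x k scatter matrix Ad*Ad'.

```matlab
% Made-up data: 4 samples (columns) of dimension 6.
A = [1 2 0 4; 0 1 3 1; 2 0 1 5; 1 1 1 0; 3 2 0 1; 0 4 2 2];
n = size(A,2);
Amean = mean(A,2);
Ad = A - Amean*ones(1,n);            % remove the mean from every column

% Eigen-decomposition of the small n x n matrix instead of the k x k one.
[V,L] = eig(Ad'*Ad);
lam = diag(L);
keep = lam > 0.001*max(lam);         % same 0.001 threshold as in pca.m

% Map to unit-norm eigenvectors of the large scatter matrix Ad*Ad'.
U = Ad*V(:,keep)*diag(1./sqrt(lam(keep)));

% Checks: S*U = U*Lambda and U'*U = I (both residuals should be near zero).
S = Ad*Ad';
resid = norm(S*U - U*diag(lam(keep)));
orth_err = norm(U'*U - eye(sum(keep)));
```

With 1500 images of 60 x 45 pixels, this replaces a 2700 x 2700 eigenproblem by one whose size is set by the number of training images, which is the motivation for the snapshot method stated in the abstract.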
sortem 
% function [NV,ND] = sortem(V,D) 


% 

% Filename: sortem.m 

% Thesis Advisor: Prof. M.P. Fargues, Naval Postgraduate School, Monterey, CA 
% Thesis CoAdvisor: Prof. G. Karunasiri, Naval Postgraduate School, Monterey, CA 


% Author: Diogo Pereira, Naval Postgraduate School, Monterey, CA, 2002 
% Modified by: Colin K Lee, Naval Postgraduate school, Monterey, CA, 2004 
% Descriptions: Function to set threshold value and sort the eigenvalues and 

% eigenvectors in descending order. 

% Inputs: 

% V: eigenvectors 

% D: eigenvalues 

% Outputs: 

% NV: sorted eigenvectors 

% ND: sorted eigenvalues with values above threshold 

% Function(s) called: none 

% 

EEE ROOT OP TEE ERROR ES EY MEE PRE See METTLE AONE ME NCEE FORCES MEME OE Ae Pee MOE Co eee ME EEE ARS eee TE eS 


function [NV,ND] = sortem(V,D) 


% Sorts the columns of V according to the absolute value of the elements of D and 
% eliminates the columns of V corresponding to zero eigenvalues 

dvec = diag(D); % obtain the values of the diagonal and insert them in a vector 
% eliminate the columns of V corresponding to zeros in D 

[I]=find(dvec==0); 

dvec(I)=[]; 

V(:,I)=[]; 

NV = zeros(size(V)); 

% sort the elements of dvec in descending order of absolute value 
[L,index_dv] = sort(abs(dvec)); 

index_dv = flipud(index_dv); 

dvec=dvec(index_dv); 

% insert the elements of dvec on the diagonal 

ND=diag(dvec); 

% sort the columns of V according to index_dv 

NV=V(:,index_dv); 

return 
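
The ordering step in sortem.m can also be written with the 'descend' option of sort, which avoids the separate flipud call. The following is a minimal sketch under stated assumptions: the 3 x 3 matrix S is made-up data, and a relative tolerance replaces the exact dvec==0 test, which is a substitution and not the thesis method.

```matlab
S = [2 0 0; 0 5 0; 0 0 0];                 % made-up symmetric matrix with one zero eigenvalue
[V,D] = eig(S);
dvec = diag(D);

% Drop (near-)zero eigenvalues; the tolerance is an assumption for illustration.
nz = abs(dvec) > 1e-12*max(abs(dvec));
dvec = dvec(nz);
V = V(:,nz);

% Sort by absolute value, largest first, and reorder the eigenvectors to match.
[sorted_abs,index_dv] = sort(abs(dvec),'descend');
ND = diag(dvec(index_dv));
NV = V(:,index_dv);
```

Each column of NV remains paired with the eigenvalue in the same position on the diagonal of ND, so S*NV equals NV*ND up to rounding.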




fld 
% function [W,D]=fld(A,C) 


% 

% Filename: fld.m 

% Thesis Advisor: Prof. M.P. Fargues, Naval Postgraduate School, Monterey, CA 
% Thesis CoAdvisor: Prof. G. Karunasiri, Naval Postgraduate School, Monterey, CA 


% Author: Diogo Pereira, Naval Postgraduate School, Monterey, CA, 2002 
% Modified by: Colin K Lee, Naval Postgraduate school, Monterey, CA, 2004 

% Descriptions: Function computes the Fisher Linear Discriminant W that 

% maximizes the ratio |W'*SB*W|/|W'*SW*W|, where SW is 

% the within class scatter matrix and SB is the between class scatter 
% matrix 

% Inputs: 

% A: data matrix, containing data in columns 

% C: row array containing numbers representing the classes of 
% the elements in A 

% Outputs: 

% W: weight matrix contains the vectors in columns 

% D: matrix containing the eigenvalues on the diagonal 


% Function(s) called: pca.m, sortem.m 


function [W,D]=fld(A,C); 

MeanG=mean(A,2);%MeanG is the general mean of the data in A 
Ad=A-MeanG*ones(1,size(A,2)); %Ad is matrix A minus the mean of A 
ST=Ad*Ad';% ST is the total scatter matrix 

clear Ad 


Nclass=0; %Nclass counts the number of classes on A 


% Compute the variance within all the classes 
SW=zeros(size(A,1)); % SW is the within class scatter matrix, initialized as a square 
% matrix with all zero elements 
while size(C,2)>0, 

w=C(1,1); 

[I]=find(C==w); 

MeanClass=mean(A(:,I),2); 

Ad=A(:,I)-MeanClass*ones(1,size(A(:,I),2)); 

SW=SW+Ad*Ad'; % adds the within class scatter matrix of each class 

A(:,I)=[]; % eliminates the data already used 

C(I)=[]; % eliminates the number corresponding to the class from the C vector 

Nclass=Nclass+1; 
end 
% used for debugging 
SW; 

inv(SW)*SW; 
cond(SW); 
SB=ST-SW; 
cond(SB); 

cond(ST); 


N=min(size(A,1),Nclass-1); % N is the minimum between the dimension of the data 
% and Nclass-1 

[W,EVA]=eigs(SB,SW,N); 

W; 

EVA; 


%Set to zero the eigenvalues not equal to a finite value 
[I]=find(isfinite(EVA)==1); % Obtain the index of the eigenvalues that are finite 
EVAC=zeros(size(EVA)); 

EVAC(I)=EVA(I); 

EVA=EVAC; % EVA will contain just the eigenvalues that are finite 


% Obtain the indices of the eigenvalues greater than 0.001 times the largest eigenvalue 
[I]=find(abs(EVA)>(0.001*max(max(abs(EVA))))); 

EVAC=zeros(size(EVA)); 

EVAC(I)=EVA(I); % EVAC keeps only the values above the threshold; all other values 
% are set to zero 

[W,D]=sortem(W,EVAC); % orders the eigenvectors and eliminates those corresponding to 
% zero eigenvalues 


% this part normalizes W so that the norm of each column will be one 
NW=ones(size(W,1),1)*sqrt(sum((W.^2),1)); 

W=W./NW; 

return 
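
For two classes, the Fisher criterion has the closed-form solution w proportional to inv(SW)*(m1-m2), which gives a quick check of the behavior expected from fld.m. The sketch below is an assumption-laden illustration: the data in c1 and c2 are made up (spread along the first axis, separated along the second, as in the "LDA works" example), and it uses the closed form with the backslash operator instead of the eigs call in fld.m.

```matlab
% Made-up two-class data, one sample per column.
c1 = [0 1 2 3; 0.5 0.6 0.4 0.5];
c2 = [0 1 2 3; 0.0 0.1 -0.1 0.0];

% Class means and within class scatter matrix SW.
m1 = mean(c1,2);
m2 = mean(c2,2);
d1 = c1 - m1*ones(1,size(c1,2));
d2 = c2 - m2*ones(1,size(c2,2));
SW = d1*d1' + d2*d2';

% Two-class Fisher direction, normalized to unit length.
w = SW\(m1-m2);
w = w/norm(w);

% Project both classes onto w; the projections should not overlap.
p1 = w'*c1;
p2 = w'*c2;
```

Here w points almost entirely along the second axis, the direction that separates the classes, even though most of the total variance lies along the first axis.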
pca_eig 


% 

% Filename: pca_eig.m 

% Thesis Advisor: Prof. M.P. Fargues, Naval Postgraduate School, Monterey, CA 
% Thesis CoAdvisor: Prof. G. Karunasiri, Naval Postgraduate School, Monterey, CA 


% Author: Diogo Pereira, Naval Postgraduate School, Monterey, CA, 2002 
% Modified by: Colin K Lee, Naval Postgraduate school, Monterey, CA, 2004 
% Descriptions: Routine program examines the error rate versus the number of 

% eigenvectors used in PCA-based classification 

% Inputs: 

% A_all: data matrix 

% centroids_all: centroids computed from A_all 

% Outputs: Error Rate 

% Function(s) called: none 

% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
clear all 

close all 

clc 


% Load the training images 
load A_16 
A=double(A_16); 


% Compute the pca of the training images 
[W,m,Amean,Ad]=pca(A); 


% Compute the projection matrix P 
P=W'*(Ad); 


% Computing the centroid of each class 
C=[]; 
C=meanclass(P,T,Person); 


% Load the testing images 

load A_5 

A=double(A_5); 
Ad=A-Amean*ones(1,size(A,2)); 
% Compute the projection matrix P 
P=W'*(Ad); 


% Computing the error rate in function to the number of eigenvectors 
dmin=100000000; 
kmax=size(P, 1); 
N_images=size(P,2); 
for k=1:kmax 
[L,M]=size(P); % there are M objects 
[N,K]=size(C); % there are K classes 
D=zeros(1,M); 
for i=1:M 
dif=C(1:k,:)-P(1:k,i)*ones(1,K); 
[dist,pos]=min(( sum((dif).^2,1) )); 
if dist<dmin 
D(1,i)=pos; 
end 
end 
NZD=find(D~=0); 
D(NZD)=Person(D(NZD)); %Obtain the class number based on the position number 
error=T-D; 
n_error(k)=sum((error~=0),2); 
end 
figure 
plot(n_error/N_images*100,'+b') 
xlabel('Number of eigenvectors'); 
ylabel('Error rate (%)'); 
title('Error rate versus number of eigenvectors'); 
grid; 
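
The minimum-distance rule used in the classification loops can be traced on a toy case. This is a minimal sketch with made-up values: the centroids in C, the test points in P, and the two class identifiers in Person are assumptions chosen so the result can be verified by hand.

```matlab
C = [0 10; 0 10];             % two class centroids, one per column
P = [1 9 0.5; 1 8 -1];        % three projected test points, one per column
Person = [3 7];               % class identifiers for the two centroids

[N,K] = size(C);              % K classes
[L,M] = size(P);              % M objects
D = zeros(1,M);
for i = 1:M
    dif = C - P(:,i)*ones(1,K);              % offsets from point i to every centroid
    [dist,pos] = min(sum(dif.^2,1));         % squared Euclidean distance, closest centroid
    D(i) = pos;
end
labels = Person(D);           % map centroid positions to class identifiers
```

The first and third points lie closest to the centroid at the origin and the second lies closest to the centroid at (10,10), so labels is [3 7 3].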
variation 


% 

% Filename: pca_eig.m 

% Thesis Advisor: Prof. M.P. Fargues, Naval Postgraduate School, Monterey, CA 
% Thesis CoAdvisor: Prof. G. Karunasiri, Naval Postgraduate School, Monterey, CA 


% Author: Diogo Pereira, Naval Postgraduate School, Monterey, CA, 2002 
% Modified by: Colin K Lee, Naval Postgraduate school, Monterey, CA, 2004 
% Descriptions: Routine program determines distance between each image and 
% all centroids using LDA. 

% Inputs: 

% A_all: data matrix 

% Outputs: min,mean, and max distance of each image to the centroids. 

% Function(s) called: pca.m, fld.m, meanclass.m 

% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
clc 

clear all; 

close all; 

load A_all 


A=double(A_all); 

Person=[1 2 3 4 5 6 8 9 11 12 13 14 15 16 50 51 52 ... 
53 54 55 56 57 58 59 60 61 62 63 64 65 66 ... 
67 68 69 70 71 72 73 74 75 76 77 78 79 80 ... 
81 82 83 84 85]; 

% load centroids_all 


[W,m,Amean,Ad,EVA]=pca(A); 

% Compute the projection matrix P 

P=W'*(Ad); 

% Compute the projection matrix P obtained with the FLD(Fisher Linear Discriminant) 
[ Wopt,D ]=fld(P,T); 

P=Wopt'*P; 


% Computing the centroid of each class 
C=meanclass(P,T,Person); 
Ad=A-Amean*ones(1,size(A,2)); 
P=W'*(Ad); 

P=Wopt'*P; 


% Classify the testing images 
dmin=100000000; 
ImageClassif=zeros(1,size(P,2)); 
[L,M]=size(P); % there are M objects 
[N,K]=size(C); % there are K classes 
D=zeros(1,M); 
for i=1:M 
dif=C-P(:,i)*ones(1,K); 
[dist,pos]=min(( sum((dif).^2,1) )); 
if dist<dmin 
D(1,i)=pos; 
end 
end 
NZD=find(D~=0); 
D(NZD)=Person(D(NZD)); 


error=T-D; 
%PersonError=Person(T(find(error~=0))) 
PersonError=T(find(error~=0)); 
n_error=sum((error~=0),2); 
Dist=distance(P,C); 
DistWithLabelLDA=[0 Person;T' Dist]; 


wk1write('DistancesLDA.wk1',DistWithLabelLDA); 
ff=wk1read('DistancesLDA.wk1'); 


save ff 
% 

% Filename: kcv.m 

% Thesis Advisor: Prof. M.P. Fargues, Naval Postgraduate School, Monterey, CA 
% Thesis CoAdvisor: Prof. G. Karunasiri, Naval Postgraduate School, Monterey, CA 


% Author: Diogo Pereira, Naval Postgraduate School, Monterey, CA, 2002 
% Modified by: Colin K Lee, Naval Postgraduate school, Monterey, CA, 2004 

% Descriptions: Routine program tests the 2 schemes using k-fold cross validation 
% 6 training/4 testing. The number of elements in each class is left 
% constant. 

% Inputs: 

% A_all: data matrix 

% Outputs: 

% Error Rate in % 

% Function(s) called: pca.m, fld.m, meanclass.m 

% 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
clear all; 

close all; 

clc 


% A_all will contain the training images 

% B will contain the testing images 

% T is the class number of the images on A 
% TB is the class number of the images on B 


load A_all 

Atemp=double(A_all); 

Ttemp=T; 

N_trials=900; 

Person = [1 2 3 4 5 6 8 9 11 12 13 14 15 16 50 51 52 53 54 55 56 57 58 59 60 61 62 63 ... 
64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85]; 
epct=0; % accumulator of the number of errors per class 
epctW=0; 
epctW1=0; 
epctW2=0; 
epctW3=0; 
epct80=0; 
epct80W1=0; 
epct80W2=0; 
epct80W3=0; 
epct50=0; 
epct50W1=0; 
epct50W2=0; 
epct50W3=0; 
epct40W3=0; 


epctLDA=0; 

for j=1:N_trials 
A = Atemp; 
T = Ttemp; 
B = []; 
TB = []; 


Remove = []; 
% Generates testing and training sets 
for i=1:150 % i goes from 1 to the total number of pictures 
f=randperm(10); 
L1=(i-1)*10+1; 
L2=i*10; 
A(:,L1:L2)=A(:,(i-1)*10+f); % mix the samples in each class 
T(:,L1:L2)=T(:,(i-1)*10+f); % adjust the class numbers 
IndSamples=[L1 L1+1 L1+2 L1+3]; 
B=[B A(:,IndSamples)]; 
TB=[TB T(:,IndSamples)]; 
Remove=[Remove IndSamples]; 
end 
A(:,Remove)=[]; 
T(:,Remove)=[]; 


% PCA 
WWWV%%%%%0 0% %0%%%%V%%%0%0%0 0% VV %%0%0%0%0%0%0%0%%%%%0%0%0% 


% Compute the pca of the training images A 
[W,m,Amean,Ad]=pca(A); 

% Compute the projection matrix P 
P=W'*(Ad); 


%%%% used to do cross validation directly without PCA or LDA 
PD=A; 

%AmeanD=mean(A,2); 

PD=A-Amean*ones(1,size(A,2)); 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% Compute the centroid of each class contained in Person 
C=meanclass(P,T,Person); 


CD=meanclass(PD,T,Person); 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 


% Using the testing images B 
Ad=B-Amean*ones(1,size(B,2)); 
% Compute the projection matrix P 
P=W'*(Ad); 

%P(1,:)=[]; 
% Classify the testing images 
dmin=100000000; 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% classifying the images directly 
PD=Ad; 


[L,M]=size(PD); % there are M objects 
[N,K]=size(CD); % there are K classes 
D=zeros(1,M); 
for i=1:M 
dif=CD-PD(:,i)*ones(1,K); 
[dist,pos]=min(( sum((dif).^2,1) )); 
if dist<dmin 
D(1,i)=pos; 
end 
end 
NZD=find(D~=0); 
D(NZD)=Person(D(NZD)); 
error=TB-D; 
n_errorDirect(j)=sum((error~=0),2); 
% computes the number of errors per class 
errornz=(error~=0); 
epc=reshape(errornz,12,50); 
epc=sum(epc,1); 
epct=epct+epc; 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% using all eigenvectors 
[ww,wh]=size(W); 
Neig=wh; 
Temp1=P(1:Neig,:); 
Temp2=C(1:Neig,:); 
[L,M]=size(Temp1); % there are M objects 
[N,K]=size(Temp2); % there are K classes 
D=zeros(1,M); 
for i=1:M 
dif=Temp2-Temp1(:,i)*ones(1,K); 
[dist,pos]=min(( sum((dif).^2,1) )); 
if dist<dmin 
D(1,i)=pos; 
end 
end 
NZD=find(D~=0); 
D(NZD)=Person(D(NZD)); 
error=TB-D; 
n_errorPCAW(j)=sum((error~=0),2); 
% computes the number of errors per class 
errornz=(error~=0); 
epc=reshape(errornz,12,50); 
epc=sum(epc,1); 
epctW=epctW+epc; 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% using all eigenvectors but the first 
[ww,wh]=size(W); 
Neig=wh; 
Temp1=P(2:Neig,:); 
Temp2=C(2:Neig,:); 
[L,M]=size(Temp1); % there are M objects 
[N,K]=size(Temp2); % there are K classes 
D=zeros(1,M); 
for i=1:M 
dif=Temp2-Temp1(:,i)*ones(1,K); 
[dist,pos]=min(( sum((dif).^2,1) )); 
if dist<dmin 
D(1,i)=pos; 
end 
end 
NZD=find(D~=0); 
D(NZD)=Person(D(NZD)); 
error=TB-D; 
n_errorPCAW1(j)=sum((error~=0),2); 
% computes the number of errors per class 
errornz=(error~=0); 
epc=reshape(errornz,12,50); 
epc=sum(epc,1); 
epctW1=epctW1+epc; 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% using all eigenvectors but the first two 
[ww,wh]=size(W); 
Neig=wh; 
Temp1=P(3:Neig,:); 
Temp2=C(3:Neig,:); 
[L,M]=size(Temp1); % there are M objects 
[N,K]=size(Temp2); % there are K classes 
D=zeros(1,M); 
for i=1:M 
dif=Temp2-Temp1(:,i)*ones(1,K); 
[dist,pos]=min(( sum((dif).^2,1) )); 
if dist<dmin 
D(1,i)=pos; 
end 
end 
NZD=find(D~=0); 
D(NZD)=Person(D(NZD)); 
error=TB-D; 
n_errorPCAW2(j)=sum((error~=0),2); 
% computes the number of errors per class 
errornz=(error~=0); 
epc=reshape(errornz,12,50); 
epc=sum(epc,1); 
epctW2=epctW2+epc; 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% using all eigenvectors but the first three 
[ww,wh]=size(W); 
Neig=wh; 
Temp1=P(4:Neig,:); 
Temp2=C(4:Neig,:); 
[L,M]=size(Temp1); % there are M objects 
[N,K]=size(Temp2); % there are K classes 
D=zeros(1,M); 
for i=1:M 
dif=Temp2-Temp1(:,i)*ones(1,K); 
[dist,pos]=min(( sum((dif).^2,1) )); 
if dist<dmin 
D(1,i)=pos; 
end 
end 
NZD=find(D~=0); 
D(NZD)=Person(D(NZD)); 
error=TB-D; 
n_errorPCAW3(j)=sum((error~=0),2); 
% computes the number of errors per class 
errornz=(error~=0); 
epc=reshape(errornz,12,50); 
epc=sum(epc,1); 
epctW3=epctW3+epc; 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% using 80 eigenvectors 
Neig=80; 
Temp1=P(1:Neig,:); 
Temp2=C(1:Neig,:); 
[L,M]=size(Temp1); % there are M objects 
[N,K]=size(Temp2); % there are K classes 
D=zeros(1,M); 
for i=1:M 
dif=Temp2-Temp1(:,i)*ones(1,K); 
[dist,pos]=min(( sum((dif).^2,1) )); 
if dist<dmin 
D(1,i)=pos; 
end 
end 
NZD=find(D~=0); 
D(NZD)=Person(D(NZD)); 
error=TB-D; 
n_errorPCA80(j)=sum((error~=0),2); 
% computes the number of errors per class 
errornz=(error~=0); 
epc=reshape(errornz,12,50); 
epc=sum(epc,1); 
epct80=epct80+epc; 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% using 80 eigenvectors without the first eigenvector 
Neig=80; 
Temp1=P(2:Neig+1,:); 
Temp2=C(2:Neig+1,:); 
[L,M]=size(Temp1); % there are M objects 
[N,K]=size(Temp2); % there are K classes 
D=zeros(1,M); 
for i=1:M 
dif=Temp2-Temp1(:,i)*ones(1,K); 
[dist,pos]=min(( sum((dif).^2,1) )); 
if dist<dmin 
D(1,i)=pos; 
end 
end 
NZD=find(D~=0); 
D(NZD)=Person(D(NZD)); 
error=TB-D; 
n_errorPCA80W1(j)=sum((error~=0),2); 
% computes the number of errors per class 
errornz=(error~=0); 
epc=reshape(errornz,12,50); 
epc=sum(epc,1); 
epct80W1=epct80W1+epc; 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% using 80 eigenvectors without the first two eigenvectors 
Neig=80; 
Temp1=P(3:Neig+2,:); 
Temp2=C(3:Neig+2,:); 
[L,M]=size(Temp1); % there are M objects 
[N,K]=size(Temp2); % there are K classes 
D=zeros(1,M); 
for i=1:M 
dif=Temp2-Temp1(:,i)*ones(1,K); 
[dist,pos]=min(( sum((dif).^2,1) )); 
if dist<dmin 
D(1,i)=pos; 
end 
end 
NZD=find(D~=0); 
D(NZD)=Person(D(NZD)); 
error=TB-D; 
n_errorPCA80W2(j)=sum((error~=0),2); 
% computes the number of errors per class 
errornz=(error~=0); 
epc=reshape(errornz,12,50); 
epc=sum(epc,1); 
epct80W2=epct80W2+epc; 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% using 80 eigenvectors without the first three eigenvectors 
Neig=80; 
Temp1=P(4:Neig+3,:); 
Temp2=C(4:Neig+3,:); 
[L,M]=size(Temp1); % there are M objects 
[N,K]=size(Temp2); % there are K classes 
D=zeros(1,M); 
for i=1:M 
dif=Temp2-Temp1(:,i)*ones(1,K); 
[dist,pos]=min(( sum((dif).^2,1) )); 
if dist<dmin 
D(1,i)=pos; 
end 
end 
NZD=find(D~=0); 
D(NZD)=Person(D(NZD)); 
error=TB-D; 
n_errorPCA80W3(j)=sum((error~=0),2); 
% computes the number of errors per class 
errornz=(error~=0); 
epc=reshape(errornz,12,50); 
epc=sum(epc,1); 
epct80W3=epct80W3+epc; 


% using 50 eigenvectors 
Neig=50; 
Temp1=P(1:Neig,:); 
Temp2=C(1:Neig,:); 
[L,M]=size(Temp1); % there are M objects 
[N,K]=size(Temp2); % there are K classes 
D=zeros(1,M); 
for i=1:M 
dif=Temp2-Temp1(:,i)*ones(1,K); 
[dist,pos]=min(( sum((dif).^2,1) )); 
if dist<dmin 
D(1,i)=pos; 
end 
end 
NZD=find(D~=0); 
D(NZD)=Person(D(NZD)); 
error=TB-D; 
n_errorPCA50(j)=sum((error~=0),2); 
% computes the number of errors per class 
errornz=(error~=0); 
epc=reshape(errornz,12,50); 
epc=sum(epc,1); 
epct50=epct50+epc; 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% using 50 eigenvectors without the first eigenvector 
Neig=50; 
Temp1=P(2:Neig+1,:); 
Temp2=C(2:Neig+1,:); 
[L,M]=size(Temp1); % there are M objects 
[N,K]=size(Temp2); % there are K classes 
D=zeros(1,M); 
for i=1:M 
dif=Temp2-Temp1(:,i)*ones(1,K); 
[dist,pos]=min(( sum((dif).^2,1) )); 
if dist<dmin 
D(1,i)=pos; 
end 
end 
NZD=find(D~=0); 
D(NZD)=Person(D(NZD)); 
error=TB-D; 
n_errorPCA50W1(j)=sum((error~=0),2); 
% computes the number of errors per class 
errornz=(error~=0); 
epc=reshape(errornz,12,50); 
epc=sum(epc,1); 
epct50W1=epct50W1+epc; 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% using 50 eigenvectors without the first two eigenvectors 
Neig=50; 
Temp1=P(3:Neig+2,:); 
Temp2=C(3:Neig+2,:); 
[L,M]=size(Temp1); % there are M objects 
[N,K]=size(Temp2); % there are K classes 
D=zeros(1,M); 
for i=1:M 
dif=Temp2-Temp1(:,i)*ones(1,K); 
[dist,pos]=min(( sum((dif).^2,1) )); 
if dist<dmin 
D(1,i)=pos; 
end 
end 
NZD=find(D~=0); 
D(NZD)=Person(D(NZD)); 
error=TB-D; 
n_errorPCA50W2(j)=sum((error~=0),2); 
% computes the number of errors per class 
errornz=(error~=0); 
epc=reshape(errornz,12,50); 
epc=sum(epc,1); 
epct50W2=epct50W2+epc; 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% using 50 eigenvectors without the first three eigenvectors 
Neig=50; 
Temp1=P(4:Neig+3,:); 
Temp2=C(4:Neig+3,:); 
[L,M]=size(Temp1); % there are M objects 
[N,K]=size(Temp2); % there are K classes 
D=zeros(1,M); 
for i=1:M 
dif=Temp2-Temp1(:,i)*ones(1,K); 
[dist,pos]=min(( sum((dif).^2,1) )); 
if dist<dmin 
D(1,i)=pos; 
end 
end 
NZD=find(D~=0); 
D(NZD)=Person(D(NZD)); 
error=TB-D; 
n_errorPCA50W3(j)=sum((error~=0),2); 
% computes the number of errors per class 
errornz=(error~=0); 
epc=reshape(errornz,12,50); 
epc=sum(epc,1); 
epct50W3=epct50W3+epc; 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
% LDA 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 


% Compute the pca of the training images 

N_pictures = 10; 

temp=(N_pictures-1)*length(Person); 

[W,m,Amean,Ad,EVA]=pca(A,temp); % This reduces the dimension to N-c = 36 - 6 = 30; 
% the best result was with 29 

% Compute the projection matrix P 
P=W'*(Ad); 
% P(1,:)=[]; 

% Compute the projection matrix P obtained with the FLD (Fisher Linear Discriminant) 
[Wopt,D]=fld(P,T); 
P=Wopt'*P; 

% Compute the centroid of each class 
C=meanclass(P,T,Person); 

% Using the testing data 
Ad=B-Amean*ones(1,size(B,2)); 
P=W'*(Ad); 
% P(1,:)=[]; 
P=Wopt'*P; 

% Classify the testing images 
dmin=100000000; 
[L,M]=size(P); % there are M objects 
[N,K]=size(C); % there are K classes 
D=zeros(1,M); 
for i=1:M 
dif=C-P(:,i)*ones(1,K); 
[dist,pos]=min(( sum((dif).^2,1) )); 
if dist<dmin 
D(1,i)=pos; 
end 
end 

NZD=find(D~=0); 
D(NZD)=Person(D(NZD)); 

error=TB-D; 
n_errorLDA(j)=sum((error~=0),2); 
% computes the number of errors per class 
errornz=(error~=0); 
epc=reshape(errornz,12,50); 
epc=sum(epc,1); 
epctLDA=epctLDA+epc; 


%HH%M%%%%%%%%%%%%%%%%%%%%%%%%%%%%M%%%%%%M%%%%% 
end 
%HH%M%%%%%%%%%%%%%%%%%%%%%%%%%%%%M%%%%%%M%%%%% 


N_samples=4*50*3; % we are using 6/4 cross validation 
h1=figure; 
L1=min(n_errorDirect); 
L2=max(n_errorDirect); 
temp1=linspace(0,L2,L2+1); 
N1=hist(n_errorDirect,temp1); 
plot(temp1/N_samples*100,N1/N_trials*100,'bd-'); 
title('Histogram of the number of errors'); 
xlabel('Percentage of errors'); 
ylabel('Percentage of the simulations'); 
grid; 
saveas(h1,'CrossValidationDirect6_4.fig'); 


h2=figure; 
L1=min(n_errorPCAW); 
L2=max(n_errorPCAW); 
temp1=linspace(0,L2,L2+1); 
N1=hist(n_errorPCAW,temp1); 
plot(temp1/N_samples*100,N1/N_trials*100,'bd-'); 
title('Histogram of the number of errors'); 
xlabel('Percentage of errors'); 
ylabel('Percentage of the simulations'); 
grid; 
saveas(h2,'CrossValidation_all_eigen_6_4.fig'); 


h3=figure; 
L1=min(n_errorPCAW1); 
L2=max(n_errorPCAW1); 
temp1=linspace(0,L2,L2+1); 
N1=hist(n_errorPCAW1,temp1); 
plot(temp1/N_samples*100,N1/N_trials*100,'bd-'); 
title('Histogram of the number of errors'); 
xlabel('Percentage of errors'); 
ylabel('Percentage of the simulations'); 
grid; 
saveas(h3,'CrossValidationall_eigenW1_6_4.fig'); 


h4=figure; 
L1=min(n_errorPCAW2); 
L2=max(n_errorPCAW2); 
temp1=linspace(0,L2,L2+1); 
N1=hist(n_errorPCAW2,temp1); 
plot(temp1/N_samples*100,N1/N_trials*100,'bd-'); 
title('Histogram of the number of errors'); 
xlabel('Percentage of errors'); 
ylabel('Percentage of the simulations'); 
grid; 
saveas(h4,'CrossValidationall_eigenW2_6_4.fig'); 


h5=figure; 
L1=min(n_errorPCAW3); 
L2=max(n_errorPCAW3); 
temp1=linspace(0,L2,L2+1); 
N1=hist(n_errorPCAW3,temp1); 
plot(temp1/N_samples*100,N1/N_trials*100,'bd-'); 
title('Histogram of the number of errors'); 
xlabel('Percentage of errors'); 
ylabel('Percentage of the simulations'); 
grid; 
saveas(h5,'CrossValidationall_eigenW3_6_4.fig'); 


h6=figure; 
L1=min(n_errorPCA80); 
L2=max(n_errorPCA80); 
temp1=linspace(0,L2,L2+1); 
N1=hist(n_errorPCA80,temp1); 
plot(temp1/N_samples*100,N1/N_trials*100,'bd-'); 
title('Histogram of the number of errors'); 
xlabel('Percentage of errors'); 
ylabel('Percentage of the simulations'); 
grid; 
saveas(h6,'CrossValidation80eig_6_4.fig'); 


h7=figure; 
L1=min(n_errorPCA80W1); 
L2=max(n_errorPCA80W1); 
temp1=linspace(0,L2,L2+1); 
N1=hist(n_errorPCA80W1,temp1); 
plot(temp1/N_samples*100,N1/N_trials*100,'bd-'); 
title('Histogram of the number of errors'); 
xlabel('Percentage of errors'); 
ylabel('Percentage of the simulations'); 
grid; 
saveas(h7,'CrossValidation80W1eig_6_4.fig'); 


h8=figure; 
L1=min(n_errorPCA80W2); 
L2=max(n_errorPCA80W2); 
temp1=linspace(0,L2,L2+1); 
N1=hist(n_errorPCA80W2,temp1); 
plot(temp1/N_samples*100,N1/N_trials*100,'bd-'); 
title('Histogram of the number of errors'); 
xlabel('Percentage of errors'); 
ylabel('Percentage of the simulations'); 
grid; 
saveas(h8,'CrossValidation100W2eig_6_4.fig'); 
h9=figure; 
L1=min(n_errorPCA80W3); 
L2=max(n_errorPCA80W3); 
temp1=linspace(0,L2,L2+1); 
N1=hist(n_errorPCA80W3,temp1); 
plot(temp1/N_samples*100,N1/N_trials*100,'bd-'); 
title('Histogram of the number of errors'); 
xlabel('Percentage of errors'); 
ylabel('Percentage of the simulations'); 
grid; 
saveas(h9,'CrossValidation80W3eig_6_4.fig'); 


h10=figure; 
L1=min(n_errorPCA50); 
L2=max(n_errorPCA50); 
temp1=linspace(0,L2,L2+1); 
N1=hist(n_errorPCA50,temp1); 
plot(temp1/N_samples*100,N1/N_trials*100,'bd-'); 
title('Histogram of the number of errors'); 
xlabel('Percentage of errors'); 
ylabel('Percentage of the simulations'); 
grid; 
saveas(h10,'CrossValidation50eig_6_4.fig'); 


h11=figure; 
L1=min(n_errorPCA50W1); 
L2=max(n_errorPCA50W1); 
temp1=linspace(0,L2,L2+1); 
N1=hist(n_errorPCA50W1,temp1); 
plot(temp1/N_samples*100,N1/N_trials*100,'bd-'); 
title('Histogram of the number of errors'); 
xlabel('Percentage of errors'); 
ylabel('Percentage of the simulations'); 
grid; 
saveas(h11,'CrossValidation50W1eig_6_4.fig'); 


h12=figure; 
L1=min(n_errorPCA50W2); 
L2=max(n_errorPCA50W2); 
temp1=linspace(0,L2,L2+1); 
N1=hist(n_errorPCA50W2,temp1); 
plot(temp1/N_samples*100,N1/N_trials*100,'bd-'); 
title('Histogram of the number of errors'); 
xlabel('Percentage of errors'); 
ylabel('Percentage of the simulations'); 
grid; 
saveas(h12,'CrossValidation50W2eig_6_4.fig'); 


h13=figure; 
L1=min(n_errorPCA50W3); 
L2=max(n_errorPCA50W3); 
temp1=linspace(0,L2,L2+1); 
N1=hist(n_errorPCA50W3,temp1); 
plot(temp1/N_samples*100,N1/N_trials*100,'bd-'); 
title('Histogram of the number of errors'); 
xlabel('Percentage of errors'); 
ylabel('Percentage of the simulations'); 
grid; 
saveas(h13,'CrossValidation50W3eig_6_4.fig'); 


h14=figure; 

L1=min(n_errorLDA); 
L2=max(n_errorLDA); 
temp2=linspace(0,L2,L2+1); 
N2=hist(n_errorLDA,temp2); 
plot(temp2/N_samples*100,N2/N_trials*100,'rd-'); 
title('Histogram of the number of errors'); 
xlabel('Percentage of errors'); 
ylabel('Percentage of the simulations'); 

grid; 
saveas(h14,'CrossValidationLDA_6_4.fig'); 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 


% saving the error per class 
ns=((N_samples/50)*N_trials)/100; 

epct=epct/ns; 

epctW=epctW/ns; 

epctW1=epctW1/ns; 

epctW2=epctW2/ns; 

epctW3=epctW3/ns; 

epct80=epct80/ns; 

epct80W1=epct80W1/ns; 
epct80W2=epct80W2/ns; 
epct80W3=epct80W3/ns; 

epct50=epct50/ns; 

epct50W1=epct50W1/ns; 
epct50W2=epct50W2/ns; 
epct50W3=epct50W3/ns; 

epctLDA=epctLDA/ns; 
M=[Person;epct;epctW;epctW1;epctW2;epctW3;epct80;epct80W1;epct80W2;epct80W3; 
epct50;epct50W1;epct50W2;epct50W3;epctLDA]; 


%t='Errors per class 60%-train-40%test'; 
filename1='ErrorPerClass6_4perc'; 
%wk1write(filename,t,1,1); 
%wk1write(filename,Person,3,2); 
wk1write(filename1,M); 

dd1=wk1read(filename1); 

save dd1 


% convert the percentages back to absolute error counts per class 
ns=100/((N_samples/50)*N_trials); 

epct=epct/ns; 

epctW=epctW/ns; 

epctW1=epctW1/ns; 

epctW2=epctW2/ns; 

epctW3=epctW3/ns; 

epct80=epct80/ns; 

epct80W1=epct80W1/ns; 
epct80W2=epct80W2/ns; 
epct80W3=epct80W3/ns; 

epct50=epct50/ns; 

epct50W1=epct50W1/ns; 
epct50W2=epct50W2/ns; 
epct50W3=epct50W3/ns; 

epctLDA=epctLDA/ns; 
M=[Person;epct;epctW;epctW1;epctW2;epctW3;epct80;epct80W1;epct80W2;epct80W3; 
epct50;epct50W1;epct50W2;epct50W3;epctLDA]; 


%t='Errors per class 60%-train-40%test'; 
filename2='ErrorPerClass6_4abs'; 
%wk1write(filename,t, 1,1); 
%wk1write(filename,Person,3,2); 

wk1write(filename2,M); 

dd2=wk1read(filename2); 

save dd2 


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 


APPENDIX B. SIMULATION RESULTS 


This appendix contains the simulation results obtained from the k-fold cross validation 
implementation for different numbers of classes. All simulations used 200 iterations. 
Each table heading states the number of classes and iterations. For example, “Mean and 
Median Error Rate with 14 Classes and 200 Iterations” introduces the mean and median 
error rates obtained with 14 classes for the PCA- and LDA-based schemes. 
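
The per-class error rates and the mean and median rows in the tables below can be obtained from accumulated misclassification counts roughly as in the following sketch. It is only an illustration under stated assumptions, not the thesis code: the names err_counts and n_test_per_class and the count values are hypothetical, and 12 test images per class assumes the 60%-train/40%-test split of 30 images per subject used by the Appendix A script.

```matlab
% Hypothetical accumulated misclassification counts for three classes
% (one entry per class, summed over all iterations); the values are made up.
err_counts = [356 0 433];
n_test_per_class = 12;   % assumed: 40% of the 30 images per subject
n_trials = 200;          % iterations per simulation

% Error rate per class, in percent of all test decisions for that class
err_pct = 100 * err_counts / (n_test_per_class * n_trials);

% Summary rows of the kind shown at the bottom of each table
mean_pct = mean(err_pct);
median_pct = median(err_pct);

fprintf('%6.2f ', err_pct);
fprintf('\n');
fprintf('mean   %6.2f\nmedian %6.2f\n', mean_pct, median_pct);
```

With these assumptions each class contributes 12 x 200 = 2400 test decisions, so every per-class entry is a multiple of 1/24 of a percent.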


Mean and Median Error Rate with 14 Classes and 200 Iterations: 


| [illegible] | 45.58 | 18.04 | 14.04 | 14.83 | 20.13 | 23.58 | 0.00 | 
| [illegible] | 38.04 | 10.33 | 14.92 | 12.00 | 10.29 | 12.50 | 0.00 | 
| 4 | 14.83 | 0.00 | 0.00 | 0.00 | 0.00 | 0.13 | 0.00 | 
| 5 | 0.04 | 31.50 | 31.67 | 19.42 | 10.63 | 22.75 | 0.00 | 
| [illegible] | 18.13 | 14.04 | 13.88 | 4.33 | 5.50 | 8.46 | 2.08 | 
| [illegible] | 11.96 | 21.88 | 22.46 | 16.38 | 15.04 | 20.75 | 0.58 | 
| [illegible] | [illegible] | 0.00 | 0.21 | 0.25 | 2.92 | 21.46 | 0.04 | 
| 12 | 8.63 | 4.25 | [illegible] | [illegible] | 5.75 | 12.50 | 0.00 | 
| 13 | 76.88 | 33.75 | 35.67 | 25.04 | 31.33 | 21.04 | 0.96 | 
| 14 | 5.58 | 0.33 | 2.13 | 0.63 | 1.96 | 4.54 | 0.00 | 
| mean | 17.88 | 13.19 | 12.34 | 8.51 | 10.03 | 14.31 | 0.62 | 
| median | 12.67 | 14.13 | 12.15 | 5.21 | 7.29 | 12.50 | 0.10 | 


Mean and Median Error Rate with 15 Classes and 200 Iterations: 


| [illegible] | 44.79 | 16.67 | 12.54 | 12.42 | 17.33 | 19.13 | 0.00 | 
| [illegible] | 86.96 | 17.50 | 20.46 | 18.17 | 7.71 | 9.21 | 0.00 | 
| 4 | 15.29 | 0.00 | 0.00 | 0.04 | 0.04 | 0.04 | 0.00 | 
| 5 | 0.00 | 13.33 | 19.13 | 14.79 | 14.58 | 13.83 | 0.29 | 
| [illegible] | [illegible] | 8.38 | 11.04 | 1.29 | [illegible] | 8.29 | 1.42 | 
| 8 | 12.04 | 22.13 | 22.71 | 16.63 | 15.92 | 24.25 | 1.21 | 
| [illegible] | [illegible] | 0.00 | 1.08 | 0.83 | 2.21 | [illegible] | 0.00 | 
| [illegible] | [illegible] | 0.00 | 0.04 | 0.29 | 0.75 | 2.67 | 0.00 | 
| 50 | 0.00 | 0.29 | 3.04 | 3.13 | 2.67 | 31.29 | 0.00 | 
| mean | 16.38 | 11.30 | 14.04 | 8.24 | 9.08 | 12.13 | 0.42 | 
| median | 12.04 | 8.38 | 7.24 | 3.33 | 5.88 | 8.29 | 0.08 | 





Mean and Median Error Rate with 16 Classes and 200 Iterations: 


| [illegible] | 45.20 | 24.08 | 16.25 | 17.74 | 14.79 | 21.63 | 0.00 | 
| [illegible] | 38.63 | 22.79 | 18.46 | 19.67 | 24.04 | 12.88 | 0.00 | 
| [illegible] | 14.83 | 0.04 | 0.00 | 0.00 | 0.00 | 0.08 | 0.00 | 
| 5 | 0.00 | 13.33 | 19.00 | 20.17 | 9.38 | 9.00 | 0.33 | 
| [illegible] | [illegible] | 9.75 | 8.29 | [illegible] | [illegible] | [illegible] | 1.00 | 
| 8 | 12.06 | 22.71 | 25.04 | 21.46 | 17.20 | 23.46 | 1.46 | 
| [illegible] | [illegible] | 0.00 | 0.21 | 1.38 | 0.96 | 1.21 | 0.00 | 
| [illegible] | 4.63 | 0.04 | 0.04 | 0.00 | 0.00 | 2.42 | 0.00 | 
| 15 | 1.63 | 4.46 | 1.63 | [illegible] | 0.42 | 0.83 | 0.00 | 
| 51 | 0.88 | 6.46 | 7.04 | 2.96 | 7.67 | 11.58 | 0.00 | 
| mean | 15.61 | 12.29 | 11.36 | 11.93 | 10.41 | 11.08 | 0.34 | 
| median | 9.25 | 6.83 | 8.58 | 10.35 | 7.00 | 9.00 | 0.04 | 


Mean and Median Error Rate with 17 Classes and 200 Iterations: 


| [illegible] | 45.67 | 34.83 | 26.63 | 29.63 | 18.04 | 23.92 | 0.00 | 
| [illegible] | 88.79 | 21.96 | 19.42 | 24.17 | 18.42 | 13.38 | 0.00 | 
| 4 | 15.54 | 0.54 | 0.00 | 0.00 | 0.25 | 0.79 | 0.00 | 
| [illegible] | 16.92 | 5.54 | 15.00 | 14.13 | 6.13 | 8.08 | 0.08 | 
| 8 | 13.67 | 21.79 | 23.21 | 27.33 | 16.92 | 23.63 | 1.46 | 
| [illegible] | [illegible] | 0.00 | 0.00 | 1.50 | 0.96 | 1.21 | 0.00 | 
| [illegible] | [illegible] | 8.38 | 38.38 | 7.08 | 3.83 | 5.00 | 0.00 | 
| 14 | 4.08 | 0.08 | 0.04 | 0.33 | [illegible] | 1.67 | 0.00 | 
| 15 | [illegible] | 2.92 | 2.13 | 3.79 | [illegible] | 1.50 | 0.00 | 
| 50 | 0.25 | 0.92 | 3.50 | 9.46 | 9.63 | 19.63 | 0.00 | 
| [illegible] | [illegible] | [illegible] | [illegible] | 3.75 | [illegible] | 11.88 | 0.00 | 
| 52 | 0.00 | 9.33 | 14.96 | 25.25 | 20.63 | 17.08 | 0.04 | 
| mean | 14.87 | 12.45 | 12.25 | 16.23 | 11.64 | 12.29 | 0.29 | 
| median | 4.17 | 5.54 | 6.50 | 11.54 | [illegible] | 11.88 | 0.00 | 





Mean and Median Error Rate with 18 Classes and 200 Iterations: 


| [illegible] | 46.79 | 31.33 | 20.46 | 29.67 | 19.42 | 24.08 | 0.00 | 
| [illegible] | 38.67 | 28.29 | 13.42 | 26.04 | 24.00 | 13.71 | 0.00 | 
| 4 | 15.33 | 0.58 | 0.00 | 0.00 | 0.33 | 0.92 | [illegible] | 
| 5 | 0.00 | 14.25 | 24.25 | 20.08 | 15.92 | 12.83 | 0.46 | 
| [illegible] | 16.79 | 5.79 | 13.46 | 14.54 | 5.67 | 6.92 | 0.17 | 
| [illegible] | [illegible] | 24.29 | 23.08 | 27.29 | [illegible] | 25.47 | 1.79 | 
| [illegible] | [illegible] | 0.00 | 0.00 | 3.38 | 0.67 | 0.83 | 0.00 | 
| 12 | 3.08 | 2.63 | 2.50 | 6.24 | 3.13 | 4.88 | [illegible] | 
| [illegible] | 5.08 | [illegible] | 0.13 | 0.33 | 0.00 | 1.54 | 0.00 | 
| 16 | [illegible] | [illegible] | 3.58 | 8.21 | 6.46 | 6.88 | 0.75 | 
| [illegible] | [illegible] | 4.92 | 5.75 | 13.25 | 11.88 | 22.29 | 0.00 | 
| [illegible] | [illegible] | 1.33 | 2.83 | 5.25 | 5.04 | 11.46 | 0.00 | 
| 52 | 0.00 | 9.29 | 12.50 | 22.46 | 21.25 | 15.54 | 0.00 | 
| mean | 14.22 | 13.13 | 12.74 | 17.44 | 12.94 | 13.56 | 0.58 | 
| median | 4.63 | 7.54 | 9.74 | 13.90 | 9.17 | 12.15 | 0.02 | 


Mean and Median Error Rate with 19 Classes and 200 Iterations: 


| [illegible] | 89.17 | 34.50 | 16.17 | 26.08 | 23.92 | 14.71 | 0.00 | 
| 4 | 16.83 | 1.13 | 0.00 | 0.04 | 1.08 | 1.17 | 0.00 | 
| [illegible] | 0.00 | 10.33 | 24.63 | 23.58 | 14.54 | 12.63 | 0.33 | 
| 6 | 17.50 | 5.96 | 12.21 | 15.08 | 6.21 | 8.46 | 0.04 | 
| 8 | 12.17 | 20.88 | 22.46 | 26.63 | 20.13 | 26.79 | 1.42 | 
| [illegible] | [illegible] | 0.00 | 0.00 | 1.63 | 0.92 | 1.25 | 0.00 | 
| 12 | 38.88 | 3.08 | 2.79 | 6.42 | [illegible] | 5.92 | 0.00 | 
| 14 | 5.58 | 0.54 | 0.33 | 0.92 | 0.08 | 4.17 | 0.00 | 
| 15 | [illegible] | 3.54 | 10.17 | 9.38 | 5.08 | 4.79 | 0.00 | 
| 50 | 4.13 | 5.63 | 6.50 | 16.08 | 11.21 | 25.54 | 0.00 | 
| [illegible] | [illegible] | 0.88 | 2.38 | 7.92 | 7.13 | 12.42 | 0.00 | 
| 52 | 0.00 | 5.75 | 7.83 | 16.71 | 18.08 | 16.71 | 0.00 | 
| 54 | 0.00 | 0.33 | [illegible] | 1.29 | 5.67 | 9.13 | 0.08 | 
| mean | 13.95 | 12.77 | 12.02 | 16.52 | 12.78 | 14.15 | 0.67 | 
| median | 5.58 | 5.75 | 7.83 | 15.08 | 7.13 | 12.42 | 0.04 | 


Mean and Median Error Rate with 20 Classes and 200 Iterations: 


| [illegible] | 16.38 | 1.63 | 0.00 | 0.29 | 0.38 | [illegible] | 0.00 | 
| 5 | 0.00 | 11.58 | 28.50 | 26.92 | 16.96 | 14.13 | 0.33 | 
| [illegible] | 16.71 | 5.33 | 11.54 | 14.04 | 5.04 | [illegible] | 0.13 | 
| 8 | [illegible] | 20.38 | 22.79 | 30.54 | 21.88 | 27.50 | 1.42 | 
| [illegible] | [illegible] | 0.00 | 0.00 | 0.54 | 0.38 | 0.63 | [illegible] | 
| 12 | 3.38 | 3.04 | 2.88 | 6.58 | 4.46 | 6.00 | 0.00 | 
| [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | 0.25 | 4.67 | 0.00 | 
| 15 | 1.33 | 7.08 | 12.71 | 10.38 | 5.38 | 3.75 | 0.00 | 
| [illegible] | 4.33 | 6.33 | 7.46 | 15.92 | 14.08 | 27.71 | 0.00 | 
| [illegible] | 1.50 | 1.46 | 3.04 | 7.38 | 6.92 | 11.38 | 0.00 | 
| 52 | 0.00 | 5.00 | 6.04 | 18.38 | 20.42 | 16.67 | 0.00 | 
| [illegible] | 0.00 | 0.33 | 0.63 | 2.33 | 6.54 | 11.92 | 0.04 | 
| mean | 13.05 | 13.34 | 13.08 | 18.21 | 13.79 | 14.99 | 0.76 | 
| median | 5.23 | 6.71 | 10.19 | 14.98 | 10.69 | 12.42 | 0.06 | 


Mean and Median Error Rate with 25 Classes and 200 Iterations: 


[rows illegible in the source; surviving fragments: 47.13, 26.25, 6, 17.9, 0.00, 0.00, 0.00, 02.08, 0.00] 
| [illegible] | 0.00 | 15.50 | 34.88 | 30.17 | 22.42 | 16.08 | 0.50 | 
| [illegible] | 0.92 | 0.04 | 0.20 | 0.75 | 0.42 | 0.79 | [illegible] | 
| [illegible] | 6.04 | 0.83 | 1.33 | 4.17 | 0.29 | 3.79 | 0.00 | 
| [illegible] | [illegible] | 8.13 | 7.50 | 8.88 | 1.46 | 0.79 | [illegible] | 
| [illegible] | 1.92 | 0.96 | 2.08 | 4.46 | 6.96 | 14.29 | 0.08 | 
| [illegible] | [illegible] | [illegible] | 2.04 | 2.25 | 10.04 | 16.50 | 0.00 | 
| [illegible] | 11.25 | 28.21 | 38.21 | 40.29 | 30.50 | 36.33 | 9.38 | 
| mean | 12.90 | 16.12 | 16.56 | 20.20 | 16.10 | 17.26 | 1.96 | 
| median | 6.04 | 14.88 | 12.00 | 15.88 | 17.33 | 14.79 | 0.33 | 


Mean and Median Error Rate with 30 Classes and 200 Iterations: 


| 4 | 16.58 | 2.38 | 0.00 | 0.00 | 0.04 | 0.00 | 0.00 | 
| [illegible] | 0.00 | 16.17 | 36.29 | 30.17 | 26.00 | 17.17 | 0.25 | 
| [illegible] | 18.71 | 5.29 | 12.63 | 10.96 | 5.54 | 7.67 | 0.04 | 
| 8 | 11.38 | 26.42 | 36.21 | 41.50 | 33.67 | 37.08 | 3.88 | 
| [illegible] | [illegible] | 0.08 | 0.04 | [illegible] | 0.08 | 1.08 | 0.21 | 
| 12 | 38.83 | 2.96 | 5.29 | 7.00 | 1.04 | 2.08 | 0.00 | 
| 14 | 5.33 | [illegible] | [illegible] | 38.96 | [illegible] | 3.08 | 0.00 | 
| 15 | [illegible] | 10.25 | 8.67 | 8.25 | [illegible] | 1.13 | 0.00 | 
| 52 | 0.00 | [illegible] | 9.75 | 14.88 | 21.42 | 8.04 | 0.29 | 
| 53 | 20.67 | 29.83 | 33.21 | 34.54 | 32.38 | 37.17 | 9.96 | 
| 54 | [illegible] | 8.08 | [illegible] | 9.00 | 11.83 | 21.42 | 0.17 | 
| 60 | 16.92 | 38.67 | 43.33 | 45.63 | 36.96 | 34.63 | 7.04 | 
| mean | 14.76 | 19.40 | 20.68 | 23.78 | 19.13 | 21.59 | 2.60 | 
| median | 9.73 | 15.42 | 17.73 | 20.90 | 18.27 | 18.73 | 1.40 | 


Mean and Median Error Rate with 35 Classes and 200 Iterations: 


| [illegible] | [illegible] | 3.88 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 
| [illegible] | 17.96 | 5.67 | 12.08 | 12.00 | 6.21 | 5.75 | 0.00 | 
| 8 | 13.33 | 26.67 | 36.04 | 39.25 | 30.25 | 34.21 | 4.38 | 
| [illegible] | [illegible] | 0.00 | 0.04 | [illegible] | [illegible] | 0.96 | 0.17 | 
| 12 | 4.25 | 8.54 | 6.00 | 6.83 | 0.63 | 2.25 | 0.13 | 
| [illegible] | [illegible] | 1.58 | [illegible] | 3.58 | 0.04 | 3.24 | 0.00 | 
| 15 | 1.92 | 13.63 | 9.33 | 7.33 | 0.92 | 1.08 | 0.00 | 
| [illegible] | 2.58 | 2.58 | 4.42 | 9.29 | 9.96 | 14.54 | 0.00 | 
| 60 | 23.20 | 40.00 | 46.20 | 57.13 | 39.83 | 37.25 | 6.79 | 
| 66 | 15.04 | 15.67 | 22.54 | 53.75 | 57.33 | 60.54 | 5.50 | 
| 68 | 43.17 | 54.21 | 44.50 | 50.29 | 43.04 | 45.46 | 20.38 | 
| 69 | 26.21 | 31.25 | 44.21 | 42.83 | 46.50 | 47.83 | 4.83 | 
| mean | 17.35 | 22.19 | 23.85 | 28.48 | 22.68 | 25.21 | 3.60 | 
| median | 14.29 | 16.50 | 22.54 | 27.54 | 20.79 | 25.71 | 2.04 | 


Mean and Median Error Rate with 40 Classes and 200 Iterations: 


| [illegible] | 16.54 | 2.96 | 0.00 | 0.00 | 0.25 | 0.00 | 0.00 | 
| [illegible] | 19.75 | 6.75 | 12.83 | 11.33 | 7.83 | 7.58 | 0.00 | 
| 8 | 14.54 | 27.58 | 40.04 | 38.38 | 34.20 | 35.92 | 4.71 | 
| [illegible] | [illegible] | 0.04 | 0.00 | 0.75 | 0.33 | 0.58 | 0.13 | 
| [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | 4.25 | 0.00 | 
| 15 | 1.58 | 12.79 | 10.96 | 8.92 | 0.96 | 1.24 | 0.04 | 
| 16 | 5.04 | 3.08 | 5.63 | [illegible] | [illegible] | 8.29 | 0.96 | 
| 60 | 32.04 | 45.63 | 51.96 | 57.83 | 41.42 | 37.83 | 7.75 | 
| 66 | [illegible] | 17.38 | 28.50 | 55.08 | 59.79 | 61.92 | 5.71 | 
| 68 | 43.08 | 56.75 | 46.92 | 48.21 | 42.08 | 46.96 | 20.33 | 
| [illegible] | 30.71 | 35.13 | 44.08 | 42.71 | 48.96 | 51.88 | 5.96 | 
| mean | 19.25 | 23.78 | 26.01 | 29.15 | 24.86 | 27.58 | 4.08 | 
| median | 16.06 | 19.17 | 25.33 | 29.52 | 24.85 | 27.48 | 2.71 | 


Mean and Median Error Rate with 45 Classes and 200 Iterations: 


| 4 | [illegible] | 3.83 | 0.04 | 0.00 | 0.21 | 0.04 | 0.00 | 
| 5 | 0.00 | 17.00 | 31.20 | 28.83 | 26.79 | 23.25 | 0.00 | 
| [illegible] | 18.50 | 5.83 | 9.33 | 9.25 | 8.04 | 8.46 | 0.13 | 
| 8 | 12.25 | 25.88 | 39.46 | 34.25 | 31.88 | 36.63 | 4.13 | 
| [illegible] | [illegible] | 0.00 | 0.00 | 0.38 | 0.33 | [illegible] | 0.25 | 
| [illegible] | [illegible] | 26.38 | 21.79 | 22.46 | 5.50 | 7.08 | 0.96 | 
| 14 | 5.50 | [illegible] | 1.67 | 3.88 | 0.00 | 3.54 | 0.00 | 
| 15 | 2.21 | 13.08 | 14.38 | 12.79 | 1.75 | 1.04 | 0.00 | 
| 60 | 33.75 | 49.88 | 49.96 | 50.04 | 42.50 | 38.00 | 12.75 | 
| 63 | 18.21 | [illegible] | 46.00 | 64.04 | 50.71 | 59.42 | 6.96 | 
| [illegible] | 21.92 | 25.47 | 35.50 | 55.83 | 60.38 | 61.58 | 6.29 | 
| 68 | 55.38 | 63.50 | 45.42 | 45.38 | 43.25 | 49.75 | 18.92 | 
| [illegible] | 84.71 | 37.42 | 44.21 | 42.88 | 50.92 | 54.71 | 4.08 | 
| 80 | 28.83 | 29.83 | 31.63 | 34.13 | 35.75 | 37.25 | 3.25 | 
| mean | 21.48 | 25.97 | 27.31 | 29.44 | 27.11 | 30.96 | 4.78 | 
| median | 18.13 | 25.17 | 27.42 | 29.50 | 27.54 | 29.92 | 3.79 | 


Mean and Median Error Rate with 50 Classes and 200 Iterations: 


| [illegible] | [illegible] | [illegible] | [illegible] | 0.21 | 0.38 | 0.00 | 0.00 | 
| [illegible] | 0.00 | [illegible] | 26.33 | 26.13 | 27.17 | 22.83 | 0.04 | 
| 6 | 79.08 | 6.00 | 7.36 | 7.54 | 8.96 | 9.04 | 0.17 | 
| 8 | 73.63 | 27.25 | [illegible] | [illegible] | 30.79 | 33.86 | 4.04 | 
| 9 | 0.38 | 0.13 | 0.00 | 0.13 | 0.04 | 0.13 | 0.63 | 
| [illegible] | [illegible] | [illegible] | [illegible] | [illegible] | 9.33 | 9.96 | 0.92 | 
| 60 | 41.79 | 56.08 | 65.04 | 52.20 | 48.74 | 42.46 | 12.54 | 
| 66 | 19.29 | 23.25 | 47.83 | 65.50 | 69.83 | 67.42 | 6.96 | 
| 68 | 53.79 | 62.21 | 42.67 | 44.08 | 42.54 | 49.46 | 18.50 | 
| 69 | 33.63 | 36.96 | 44.21 | 42.96 | 54.04 | 65.58 | 6.13 | 
| 80 | 26.13 | 27.29 | 28.42 | 31.00 | 32.54 | 34.17 | 3.79 | 
| mean | 22.39 | 26.66 | 27.81 | 30.02 | 28.77 | 32.57 | 5.39 | 
| median | 19.25 | 26.66 | 27.51 | 30.02 | 28.75 | 32.57 | 4.04 | 


Mean and Median Error Rate with 15 Classes, without Class #13, and 200 Iterations: 


| [illegible] | [illegible] | 24.24 | 15.13 | 20.29 | 14.08 | 19.24 | 0.00 | 
| [illegible] | 36.63 | 23.47 | 19.21 | 23.46 | 21.96 | 12.17 | 0.00 | 
| 4 | 15.42 | 0.04 | 0.00 | 0.00 | 0.00 | 0.04 | 0.00 | 
| 5 | 0.00 | 10.96 | 17.67 | 25.29 | 10.20 | 12.46 | 0.04 | 
| [illegible] | 16.50 | 5.58 | 9.33 | 9.42 | 3.67 | 9.67 | 0.71 | 
| 8 | 10.58 | 17.96 | 19.21 | 25.50 | 17.13 | 20.88 | 0.29 | 
| [illegible] | [illegible] | 0.00 | 0.29 | 2.33 | 0.88 | 1.88 | 0.00 | 
| 14 | 4.38 | 0.00 | 0.00 | 0.33 | 0.13 | 3.13 | 0.00 | 
| [illegible] | [illegible] | 8.50 | 1.58 | 3.42 | 0.25 | 1.75 | 0.00 | 
| 51 | 0.96 | 6.54 | 7.13 | 6.58 | 5.33 | 15.58 | 0.00 | 
| mean | 11.50 | 8.74 | 8.77 | 12.18 | 8.78 | 11.36 | 0.33 | 
| median | 7.87 | 6.50 | 8.10 | 10.80 | 6.65 | 10.51 | 0.04 | 





Mean and Median Error Rate with 20 Classes, without Class #13, and 200 Iterations: 


| [illegible] | 87.42 | 35.04 | [illegible] | 28.71 | 21.88 | 12.75 | 0.00 | 
| 4 | 16.38 | 1.75 | 0.00 | 0.00 | 0.13 | 0.17 | 0.00 | 
| 5 | 0.00 | 10.04 | 26.75 | 26.50 | 16.04 | 17.13 | 0.00 | 
| [illegible] | 16.63 | 5.47 | 10.17 | 12.38 | 4.42 | 4.75 | 0.00 | 
| 8 | 10.58 | 19.42 | 24.75 | 29.33 | 21.46 | 26.24 | 1.17 | 
| [illegible] | [illegible] | 0.04 | 0.04 | 0.58 | [illegible] | 0.46 | 0.00 | 
| [illegible] | [illegible] | 13.25 | 6.96 | 9.08 | 4.25 | 4.67 | 0.58 | 
| 12 | [illegible] | [illegible] | 3.08 | 5.92 | 4.17 | 4.79 | 0.00 | 
| 14 | [illegible] | 0.21 | 0.08 | 1.04 | 0.13 | 3.63 | 0.00 | 
| 50 | 4.00 | 6.04 | 6.88 | 16.58 | 13.88 | 27.96 | 0.00 | 
| [illegible] | 0.88 | 0.88 | 1.50 | 7.83 | 6.50 | 13.21 | 0.00 | 
| [illegible] | 0.00 | 0.13 | [illegible] | 2.58 | [illegible] | 12.83 | 0.00 | 
| mean | 11.05 | 10.17 | 11.05 | 15.49 | 11.72 | 14.12 | 0.79 | 
| median | 4.29 | 7.77 | 8.56 | 11.67 | 7.98 | 12.79 | 0.02 | 


Mean and Median Error Rate with 25 Classes, without Class #13, and 200 Iterations: 


| 4 | 16.08 | 2.33 | 0.00 | 0.00 | 0.04 | 0.00 | 0.00 | 
| [illegible] | 0.00 | 14.20 | 29.29 | 27.29 | 19.58 | 13.96 | 0.08 | 
| 6 | 18.63 | 6.04 | 11.75 | 11.88 | 4.08 | 5.33 | 0.00 | 
| 8 | 12.13 | 23.71 | 29.25 | 40.04 | 29.42 | 35.33 | 2.04 | 
| [illegible] | [illegible] | [illegible] | [illegible] | 0.38 | 0.25 | 0.96 | 0.00 | 
| [illegible] | [illegible] | 1.08 | 0.92 | 3.83 | 0.38 | 4.00 | 0.00 | 
| 15 | 1.33 | 8.63 | 7.08 | 7.21 | 0.83 | 0.50 | 0.00 | 
| 54 | [illegible] | 2.75 | 2.17 | 3.83 | 11.25 | 17.46 | 0.00 | 
| 55 | [illegible] | 17.92 | 34.71 | 37.88 | 33.42 | 37.67 | 3.08 | 
| [illegible] | 11.58 | 25.47 | 35.42 | 39.71 | 27.67 | 33.13 | 6.38 | 
| mean | 9.88 | 12.61 | 14.23 | 18.41 | 14.70 | 17.66 | 1.57 | 
| median | 4.88 | 8.92 | 11.42 | 15.54 | 12.58 | 17.17 | 0.33 | 